Abstract. We propose a numerical method for studying the cogrowth of
finitely presented groups. To validate our numerical results we compare them
against the corresponding data from groups whose cogrowth series are known
exactly. Further, we add to the set of such groups by finding the cogrowth
series for the Baumslag-Solitar groups $\mathrm{BS}(N,N) = \langle a, b \mid a^N b = b a^N \rangle$ and prove
that their cogrowths are algebraic numbers. We have been unable to find the
cogrowth series for other Baumslag-Solitar groups, but we have found recurrences
that yield the first few terms of the cogrowth series exponentially faster
than is possible by naive methods. Finally, we apply our numerical method
to several presentations of Thompson's group F, and our results give a strong
indication that the group is not amenable.



1. Introduction 

In this article we consider the function that counts the number of trivial words in 
a finitely presented group, the so-called cogrowth function. The exponential growth 
rate of this function is simply called the cogrowth and is intimately related to the 
amenability of the group [12, 19]. Amenability is an active area of current research,
and cogrowth is just one of many characterisations. The amenability of one group 
in particular - Richard Thompson's group F - has been the subject of intensive 
research and conjecture. 

In this article we propose a new numerical technique to estimate the cogrowth 
of finitely presented groups, based on ideas from statistical mechanics, which we 
show to be quite accurate in predicting the cogrowth for a range of groups for which 
the cogrowth series and/or amenability is known: these include Baumslag-Solitar 
groups, a finitely presented relative of the Basilica group, and some free products 
studied by Kuksov [25]. We apply the method to several different presentations
for Thompson's group F, and the evidence obtained points strongly towards the 
conclusion that F is not amenable. 

The present article builds on previous work of a subset of the authors [H] , where 
various techniques, also based in statistical mechanics, were applied to the problem 
of estimating and computing the cogrowth for Thompson's group F. This in turn 
built on previous work of Burillo, Cleary and Wiest [8], and Arzhantseva, Guba,
Lustig, and Preaux [3], who applied experimental techniques to the problem of
deciding the amenability of F.

For the benefit of readers outside of group theory, we start with a precise
definition of group presentations and cogrowth.

Definition 1.1 (Presentations and trivial words). A presentation

$\langle a_1, \dots, a_k \mid R_1, \dots, R_m \rangle$

encodes a group as follows.

• The letters $a_i$ are elements of the group and are called generators, and the
$R_i$ are finite length words over the letters $a_1, \dots, a_k, a_1^{-1}, \dots, a_k^{-1}$ and are
called relations or relators.

• A word in the letters $a_1, \dots, a_k, a_1^{-1}, \dots, a_k^{-1}$ is called freely reduced if it
contains no subword $a_i a_i^{-1}$ or $a_i^{-1} a_i$.

• The set of all freely reduced words, together with the operation of concatenation
followed by free reduction (deleting $a_i a_i^{-1}$ and $a_i^{-1} a_i$ pairs) forms a group,
called the free group on the letters $\{a_1, \dots, a_k\}$, which we denote by
$F(a_1, \dots, a_k)$.

• Let $N(R_1, \dots, R_m)$ be the normal subgroup which contains all words of the
form $\prod_{j=1}^{m} \rho_j R_j \rho_j^{-1}$ where $\rho_j$ is any word in the free group, and $R_j$ is one of
the relators or their inverses. This subgroup is called the normal closure
of the set of relators, and is the smallest normal subgroup in $F(a_1, \dots, a_k)$
that contains all words $R_1, \dots, R_m$.

• The group encoded by the presentation $\langle a_1, \dots, a_k \mid R_1, \dots, R_m \rangle$ is defined
to be the quotient group $F(a_1, \dots, a_k)/N(R_1, \dots, R_m)$.

• It follows that a word in $F(a_1, \dots, a_k)$ equals the identity element in G if
and only if it lies in the normal subgroup $N(R_1, \dots, R_m)$, and so is equal
to a product of conjugates of relators and their inverses.

We will make extensive use of this last point in the work below. We call a word
in $F(a_1, \dots, a_k)$ that equals the identity element in G a trivial word.
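As a concrete illustration of free reduction and of the free group operation, the following Python sketch may help. The integer encoding of letters (generator $a_i$ as `i`, its inverse as `-i`) and the function names are choices made here for illustration and are not taken from the article.

```python
def free_reduce(word):
    """Freely reduce a word given as a sequence of nonzero integers.

    Assumed encoding (ours): generator a_i is i, its inverse is -i.
    """
    stack = []
    for letter in word:
        if stack and stack[-1] == -letter:
            stack.pop()            # cancel an adjacent pair such as a_i a_i^{-1}
        else:
            stack.append(letter)
    return tuple(stack)


def free_product(u, v):
    """Free group operation: concatenate, then freely reduce."""
    return free_reduce(tuple(u) + tuple(v))
```

A single left-to-right pass with a stack suffices because every cancellation exposes at most one new adjacent pair, which the stack top then handles.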

The function $c : \mathbb{N} \to \mathbb{N}$, where $c(n)$ is the number of freely reduced words of length $n$ in
the generators of a finitely generated group that represent the identity element, is
called the cogrowth function, and the corresponding generating function is called
the cogrowth series. The rate of exponential growth of the cogrowth function is the
cogrowth of the group (with respect to a chosen finite generating set). Grigorchuk
and independently Cohen [12, 19] proved that a finitely generated group is amenable
if and only if its cogrowth is $2k - 1$, that is, twice the number $k$ of generators minus 1.
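For very small lengths the cogrowth function can be computed by exhaustive enumeration. The sketch below does this for $\mathbb{Z}^2 = \langle a, b \mid b a b^{-1} a^{-1} \rangle$, where a word is trivial exactly when the exponent sums of $a$ and of $b$ both vanish. The letter encoding and the function name are illustrative assumptions, and the method is exponential in $n$, so it is only a sanity check and not the method of the article.

```python
from itertools import product


def cogrowth_counts_Z2(max_len):
    """Return [c(0), ..., c(max_len)] for Z^2 by brute force.

    Letters (our encoding): 1 = a, -1 = a^-1, 2 = b, -2 = b^-1.
    """
    letters = (1, -1, 2, -2)
    counts = []
    for n in range(max_len + 1):
        c = 0
        for w in product(letters, repeat=n):
            # skip words containing an adjacent inverse pair
            if any(x == -y for x, y in zip(w, w[1:])):
                continue
            # trivial in Z^2 iff both exponent sums are zero
            if w.count(1) == w.count(-1) and w.count(2) == w.count(-2):
                c += 1
        counts.append(c)
    return counts
```

Since $\mathbb{Z}^2$ is abelian and hence amenable, the criterion above says that $c(n)^{1/n}$ tends to $3$ along even $n$, although the convergence is far too slow to see at lengths reachable by enumeration.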

For more background on amenability and cogrowth see [231 ES), and for Thompson's
group F see [HI [TO]. The free group on two or more letters, as defined above, is
known to be non-amenable, and subgroups of amenable groups are amenable.
It follows that if a group contains a subgroup isomorphic to the free group on 2
generators ($F(a_1, a_2)$ above), then it cannot be amenable. Thompson's group F
has no such subgroup, but at the same time no simple proof of amenability has
been forthcoming, hence the intense interest in this example.

The article is organised as follows. In Section [2] we adapt an algorithm designed 
to sample self-avoiding polygons to the problem of estimating the growth rate of 
trivial words in finitely presented groups. The algorithm we describe actually works 
when the group is finitely generated but has infinitely many relations - in this case 
its application is more subtle (in the way one samples relators from an infinite list).
To further validate our algorithm we test it on groups whose cogrowth series are
known exactly. In Section [3] we add to this pool of results by finding the cogrowth
series of the Baumslag-Solitar groups $\mathrm{BS}(N,N) = \langle a, b \mid a^N b = b a^N \rangle$. We apply our
algorithm and analyse the resulting data in Section [4] and summarise our results in
Section [5].



2. Metropolis Sampling of Freely Reduced Trivial Words in Groups 

The dynamical implementation of our algorithm is inspired by the BFACF algo- 
rithm [H HJ H] , which was developed to sample lattice self-avoiding walks and poly- 
gons from stretched Boltzmann distributions. The self-avoiding walk is a model of 
polymer entropy, a celebrated unsolved problem in polymer physics and chemistry 
[T51 fT7] . Details about the implementation of the BFACF algorithm can also be 
found in [231 EH]. 

Our algorithm will be implemented to sample words in a group G along a Markov 
chain using the Metropolis algorithm [3D] . States will be sampled by the algorithm 
by generating new states from a current state via elementary moves. These ele- 
mentary moves will be defined in more detail below - they are local changes made 
in a systematic manner to a freely reduced trivial word w to obtain a new freely 
reduced trivial word v. 

The approach is as follows: Let $w_n$ be the current state of the algorithm (so
that $w_n$ is a freely reduced trivial word of G). Choose an elementary move from a
set of available elementary moves and create a trial word $w'_n$ by implementing the
elementary move on $w_n$ (where $w'_n$ is also a freely reduced trivial word). Accept $w'_n$
as the next state in the Markov chain with probability $P(w_n \to w'_n)$, in which case
the next state is $w_{n+1} = w'_n$. If $w'_n$ is rejected, then the next state is by default
$w_{n+1} = w_n$. This rejection technique is characteristic of the Metropolis algorithm
and it ensures that the sampling is aperiodic.

This implementation samples words $\{w_n\}$ for $n = 0, 1, 2, \dots$ along a Markov chain
which is initiated at a state $w_0$. The initial state $w_0$ may be chosen arbitrarily, but
must be a freely reduced trivial word of G. It is convenient to choose $w_0$ from the
set of relators of G.

2.1. Elementary moves for sampling trivial words in a group G. Let
$G = \langle a_1, a_2, \dots, a_k \mid R_1, R_2, \dots, R_m, \dots \rangle$ be a group on $k$ generators with finite length
relators $R_i$. The number of relators may be finite or infinite. Let $w$ be a freely
reduced trivial word in $\{a_1, a_1^{-1}, \dots, a_k, a_k^{-1}\}$, and denote the length of $w$ by $|w|$.
Finally, let $S$ be the set of relators $R_i$, their inverses $R_i^{-1}$ and all cyclic permutations
of relators and their inverses. Note that $S$ is an infinite set if and only if G has an
infinite set of relators.

Suppose that we have sampled along a Markov chain $\{w_n\}$ and that the current
state is a freely reduced trivial word $w = w_n$ of length $|w| = |w_n|$. A new trivial
word $w'$ is constructed from $w$ by choosing from the following two elementary moves:

• Conjugation: Let $x$ be one of the $2k$ possible generators (and their
inverses) chosen uniformly at random. Put $w' = x w x^{-1}$ and perform
free reductions on $w'$ to produce $w''$.
• Insertion: Let $R \in S$ be one of the relators or their inverses or any cyclic
permutations of those relators or their inverses.^1 Choose an integer
$m \in \{0, 1, \dots, |w|\}$ with uniform probability and partition $w$ into two subwords
$u$ and $v$, with $|u| = m$. Form $w' = uRv$, and freely reduce this word to
get $w''$. If $m = 0$, then $R$ is prepended to $w$, and if $m = |w|$, then $R$ is
appended to $w$.

The elementary moves are implemented by choosing a conjugation with probability
$p_c$, and otherwise an insertion.

The two elementary moves produce freely reduced trivial words w" by acting 
on w. A Metropolis style Monte Carlo algorithm can be implemented using these 
moves provided that they are uniquely reversible. 

One may verify that conjugations are uniquely reversible. Unfortunately, inser- 
tions are not, and this must be accounted for in the implementation of the algo- 
rithm by conditioning the use of insertion moves such that they become uniquely 
reversible. 

We show by example that insertions are not reversible: Let $R \in S$ and consider
the insertion of $R^{-1}$ to the right of $R$ in the word $a^{\ell} R a^{-\ell}$. This will reduce the word
to the empty word, but there is no elementary move which will produce $a^{\ell} R a^{-\ell}$
from the empty word by inserting any relator into the empty word (here we assume
that the words $a^{\ell} R a^{-\ell}$ are not relators). This difficulty can be overcome by rejecting
a proposed insertion of $R$ if it changes the length of the word by more than
$|R|$.

A second difficulty may arise with insertions, and we show again by example
that an insertion may not be uniquely reversible, even if it changes the length of
a word by at most $|R|$. Consider the group $\mathbb{Z}^2 = \langle a, b \mid b a b^{-1} a^{-1} \rangle$ and insert the
relator $R = b a b^{-1} a^{-1}$ into the word $u b a^{-1} b^{-1} a b a^{-1} v$ at the position marked by $*$
below:

(2.1) $u b a^{-1} b^{-1} * a b a^{-1} v \mapsto u b a^{-1} b^{-1} \cdot b a b^{-1} a^{-1} \cdot a b a^{-1} v \mapsto u b a^{-1} v$

This move can be reversed in 2 ways. First, insert $b a^{-1} b^{-1} a$ (which is a cyclic
permutation of the inverse of a relation) at the $*$

(2.2) $u * b a^{-1} v \mapsto u \cdot b a^{-1} b^{-1} a \cdot b a^{-1} v$

and second, insert another relation of $\mathbb{Z}^2$, $b^{-1} a b a^{-1}$, at the $*$

(2.3) $u b a^{-1} * v \mapsto u b a^{-1} \cdot b^{-1} a b a^{-1} \cdot v$

This will disturb the detailed balance condition required for Metropolis style
algorithms, with the result that the algorithm will sample from an incorrect stationary
distribution.

We show how to account for the above by modifying the insertion move as follows:
Reject all attempted insertions of $R \in S$ in a word if either there are cancellations
to the right, or if it changes the length of the word by more than $|R|$. Attempted
insertions which neither cancel to the right, nor change the length of the word
by more than $|R|$ will be called valid, and we call an insertion a left-insertion if
cancellations of generators only occur to the left and if the insertion is valid.



^1 For example, in BS(2, 3) defined in Section [3] the relator $a^2 b a^{-3} b^{-1}$ yields $2 \times 7 = 14$ possible
choices.

• Left-Insertion: Let $R \in S$ be one of the relators or their inverses or any
cyclic permutation of those relators and their inverses. Choose an integer
$m \in \{0, 1, 2, \dots, |w|\}$ uniformly and partition $w$ into two subwords $u$ and
$v$, with $|u| = m$. If $m = 0$ then prepend, $w' = Rw$, and note that this is
valid only if there are no cancellations of generators. If this is valid, then
put $w'' = w'$, otherwise put $w'' = w$. If $m = |w|$, then append, $w' = wR$,
and this is valid even if there are cancellations to the left. Freely reduce $w'$
to obtain $w''$. Otherwise, form $w' = uRv$. If $R$ cancels to the right with $v$
then reject the proposed move and keep $w$. Otherwise, freely reduce $w'$ to
obtain $w''$. If $|w''| < |w| - |R|$ then reject the move (and keep $w$).
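The left-insertion rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: words are tuples of nonzero integers (generator $a_i$ as `i`, inverse as `-i`), $R$ is assumed to be freely reduced, and "cancels to the right" is read as being judged after $R$ has been cancelled as far as possible against $u$. That reading is the one under which the reversing moves described in Lemma 2.1 are themselves accepted. The function name `left_insert` is hypothetical.

```python
def left_insert(w, R, m):
    """Attempt a left-insertion of R into w after the first m letters.

    Returns the new freely reduced word, or None if the move is rejected.
    Assumes w and R are freely reduced tuples of nonzero integers.
    """
    u, t, v = list(w[:m]), list(R), list(w[m:])
    # Cancel R against u on the left as far as possible.
    while u and t and u[-1] == -t[0]:
        u.pop()
        t.pop(0)
    if t:
        # What remains of R must not cancel against v.
        if v and t[-1] == -v[0]:
            return None
        return tuple(u + t + v)
    # R was absorbed completely; any further cancellation between u and v
    # would shorten the word by more than |R|.
    if u and v and u[-1] == -v[0]:
        return None
    return tuple(u + v)


# Z^2 example with a = 1, b = 2 and empty u, v in (2.1)-(2.3).
R = (2, 1, -2, -1)                      # b a b^-1 a^-1
long_word = (2, -1, -2, 1, 2, -1)       # b a^-1 b^-1 a b a^-1
rejected = left_insert(long_word, R, 3)             # move (2.1)
forward = left_insert((2, -1), (2, -1, -2, 1), 0)   # move (2.2)
back = left_insert(long_word, (-1, 2, 1, -2), 4)    # its reversal
```

In this example the insertion (2.1) is rejected because it cancels to the right, whereas (2.2) is accepted and is undone by inserting the inverse relator immediately to its right.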

Left-insertions are uniquely reversible, and are suitable as an elementary move in 
a Metropolis style Monte Carlo algorithm for sampling freely reduced trivial words 
in G. 

Lemma 2.1. Left-insertions are uniquely reversible. 

Proof. Let $w = uv$ be a freely reduced trivial word in the group G and let $R \in S$.
Form $w' = uRv$ via a left-insertion, where $u$ or $v$ may be the empty word.

• Suppose there are no possible cancellations to the left or right. Then $w'' =
uRv$, and the move can be uniquely reversed by inserting $R^{-1}$ (which must
also be a relator in the group) to the right of $R$. This gives $u R R^{-1} v \mapsto uv$.
Further cancellations cannot occur because $w = uv$ was freely reduced.
Note that this is unique because any other insertion must cancel $R$, and
to do so would require cancellations to the right and so would not be a
left-insertion.

• Suppose there are some cancellations to the left when $R$ is inserted in $w$.
In particular, in this case one has $w = u'sv$ and $R = s^{-1}t$ for some freely
reduced words $u'$, $s$ and $t$ (where $t$ may be the empty word). Inserting $R$
to the right of $s$ and freely reducing the word gives $w'' = u'tv$ (and $t$ may
be empty). This move is uniquely reversible by inserting $R^{-1} = t^{-1}s$ to the
right of $t$. This gives $u' t t^{-1} s v \mapsto u'sv = w$. No further cancellations are
possible because the original word was freely reduced. Again, by a similar
argument, this move is unique: all other possibilities require cancellation
to the right.

□ 

The conjugation and left-insertion elementary moves can be implemented in a 
Metropolis algorithm to sample freely reduced trivial words in G. 

2.2. Metropolis style implementation of the elementary moves. Conjuga- 
tions and left-insertions may be used to sample along a Markov chain in the state 
space of freely reduced trivial words of a group G on k generators. 

The algorithm is implemented as follows: Let $p_c \in [0, 1]$, $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}^+$ be
parameters of the algorithm and assume that $\beta$ is small. As above, let $S$ be the set
of all cyclic permutations of the relators and their inverses and recall that $S$ may
be finite or infinite.

Define $P$, a probability distribution over $S$, so that $P(R)$ is the probability of
choosing $R \in S$, with $\sum_{R \in S} P(R) = 1$. Further, assume that $P(R) > 0$ for all $R \in S$
and also that $P(R) = P(R^{-1})$ (we shall eventually require these two conditions).






M. ELDER, A. RECHNITZER, E.J. JANSE VAN RENSBURG, AND T. WONG 



In the case that S is finite we choose P to be the uniform distribution, although 
we are free to choose other distributions. 

Suppose that $w_n$ is the current state, a freely reduced trivial word produced by
the algorithm, and inductively construct the next state $w_{n+1}$ as follows:

o With probability $p_c$ choose a conjugation move and otherwise choose a
left-insertion.

o If the move is a conjugation, then choose one of the $2k$ possible conjugations
randomly with uniform probability: Say that the pair $(c, c^{-1})$ is chosen
where $c$ is a generator or its inverse. Put $w' = c w_n c^{-1}$ and freely reduce $w'$.
Construct $w_{n+1}$ from $w'$ and $w_n$ as follows:

(2.4) $w_{n+1} = \begin{cases} w', & \text{with probability } p = \min\left\{1, \left(\frac{|w'|+1}{|w_n|+1}\right)^{1+\alpha} \beta^{|w'| - |w_n|}\right\}; \\ w_n, & \text{otherwise.} \end{cases}$

o If the move is a left-insertion, then choose an element $R \in S$ with probability
$P(R)$. Choose a location $m \in \{0, 1, 2, \dots, |w_n|\}$ in the word $w_n$
uniformly. This is the location where the left-insertion will be attempted.
Attempt a left-insertion of $R$ at the location $m$, giving $w'$. Construct $w_{n+1}$ as follows:

(2.5) $w_{n+1} = \begin{cases} w_n, & \text{if the left-insertion of } R \text{ is not valid;} \\ w', & \text{if the left-insertion of } R \text{ is valid, with probability } p = \min\left\{1, \left(\frac{|w'|+1}{|w_n|+1}\right)^{\alpha} \beta^{|w'| - |w_n|}\right\}; \\ w_n, & \text{otherwise.} \end{cases}$



Let $w$ and $v$ be two words and suppose that $v$ was obtained from $w$ by a conjugation
as implemented above. Then the transition probability $P_r(w \to v)$ is given
by

(2.6) $P_r(w \to v) = \frac{1}{2k} \left(\frac{|v|+1}{|w|+1}\right)^{1+\alpha} \beta^{|v| - |w|}$

since a conjugation is chosen uniformly from $2k$ possibilities, and provided that $p <
1$ in equation (2.4). In that case the reverse transition is accepted with probability 1,
so its transition probability is $P_r(v \to w) = 1/(2k)$. (The common factor $p_c$ for
selecting a conjugation is omitted from both, as it cancels in what follows.) This, in
particular, shows the condition of detailed balance for conjugation moves:

(2.7) $P_r(w \to v) = \left(\frac{|v|+1}{|w|+1}\right)^{1+\alpha} \beta^{|v| - |w|} \, P_r(v \to w)$

which simplifies to the symmetric presentation

(2.8) $(|w|+1)^{1+\alpha} \beta^{|w|} P_r(w \to v) = (|v|+1)^{1+\alpha} \beta^{|v|} P_r(v \to w).$

In the alternative case that $w$ and $v$ are two words and $v$ was obtained from
$w$ by a left-insertion of $R \in S$ as implemented above, the transition probability is
given by

(2.9) $P_r(w \to v) = \frac{P(R)}{|w|+1} \left(\frac{|v|+1}{|w|+1}\right)^{\alpha} \beta^{|v| - |w|}$

where the element $R \in S$ is selected with probability $P(R)$, the location for the
left-insertion of $R$ is chosen with probability $1/(|w| + 1)$, and we have assumed
(without loss of generality) that $p < 1$ in equation (2.5).

Similarly, the transition probability of $v \to w$ via a left-insertion of $R^{-1} \in S$ is

(2.10) $P_r(v \to w) = \frac{P(R^{-1})}{|v|+1}.$

This gives a second condition on the probability distribution $P$ over $S$, namely that
$P(R) = P(R^{-1})$ for all elements $R \in S$. In this event a comparison of the last two
equations, and simplification, gives

(2.11) $(|w|+1)^{1+\alpha} \beta^{|w|} P_r(w \to v) = (|v|+1)^{1+\alpha} \beta^{|v|} P_r(v \to w)$

as a condition of detailed balance for left-insertions. This is the identical condition
obtained for conjugation in equation (2.8). The above is a proof of the following
lemma.

Lemma 2.2. Let $\{w_n\}$ be a Markov chain in the state space of freely reduced trivial words
in G, and suppose the transition of state $w_n$ to $w_{n+1}$ is due to a transition by a
conjugation move with probability $p_c$, and due to a left-insertion with probability
$q_c = 1 - p_c$. Then the Markov chain samples from the stationary distribution

$\Pr(w) = \frac{(|w|+1)^{1+\alpha} \beta^{|w|}}{M}$

over its state space, where $M$ is a normalising factor.

Proof. This lemma is a corollary of the Perron-Frobenius theorem (see [5] for example),
and follows by summing the conditions of detailed balance in equations (2.8)
and (2.11) over $v$. □
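The moves and acceptance rules of this section can be assembled into a small sampler. The Python sketch below is only an illustration under several assumptions: the integer encoding of letters, the uniform choice of $R$ from a finite $S$, the reading of right cancellation as judged after cancelling $R$ against $u$, and the acceptance exponents $1 + \alpha$ for conjugations and $\alpha$ for left-insertions (the latter being what detailed balance requires once the location probability $1/(|w|+1)$ is taken into account). All function names are hypothetical.

```python
import random


def free_reduce(word):
    stack = []
    for x in word:
        if stack and stack[-1] == -x:
            stack.pop()
        else:
            stack.append(x)
    return tuple(stack)


def left_insert(w, R, m):
    """Left-insertion of R at position m; None if the move is not valid."""
    u, t, v = list(w[:m]), list(R), list(w[m:])
    while u and t and u[-1] == -t[0]:       # cancel R against u
        u.pop()
        t.pop(0)
    if t and v and t[-1] == -v[0]:
        return None                          # cancellation to the right
    if not t and u and v and u[-1] == -v[0]:
        return None                          # length drops by more than |R|
    return tuple(u + t + v)


def cyclic_closure(relators):
    """All cyclic permutations of the relators and of their inverses."""
    S = set()
    for R in relators:
        for word in (tuple(R), tuple(-x for x in reversed(R))):
            for i in range(len(word)):
                S.add(word[i:] + word[:i])
    return sorted(S)


def metropolis_step(w, S, k, p_c, alpha, beta, rng):
    """One transition w_n -> w_{n+1} on freely reduced trivial words."""
    if rng.random() < p_c:
        c = rng.choice([g for i in range(1, k + 1) for g in (i, -i)])
        new = free_reduce((c,) + w + (-c,))
        exponent = 1 + alpha
    else:
        R = rng.choice(S)                    # uniform P over a finite S
        m = rng.randrange(len(w) + 1)
        new = left_insert(w, R, m)
        if new is None:
            return w                         # invalid move: stay put
        exponent = alpha
    ratio = ((len(new) + 1) / (len(w) + 1)) ** exponent
    ratio *= beta ** (len(new) - len(w))
    return new if rng.random() < min(1.0, ratio) else w


# Example: a short run on Z^2 = <a, b | b a b^-1 a^-1>, a = 1, b = 2.
rng = random.Random(0)
S = cyclic_closure([(2, 1, -2, -1)])
w = S[0]
lengths = []
for _ in range(1000):
    w = metropolis_step(w, S, 2, 0.5, 0.0, 0.3, rng)
    lengths.append(len(w))
```

Every state visited is a freely reduced trivial word by construction, which is a cheap invariant to check when testing an implementation.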



2.3. Irreducibility of the elementary moves. In this section we examine the 
state space of the Markov chain in Lemma 2.2 by determining the irreducibility
class of trivial freely reduced words in G with respect to the elementary moves of 
the algorithm. 

The elementary moves above may be represented as a multigraph M on the 
freely reduced words of G: Two freely reduced words w, v form an arc wv for each 
elementary move (a conjugation or a left-insertion) which takes w to v. Since each 
elementary move is uniquely reversible, M may be considered undirected. The 
irreducibility class of a freely reduced trivial word w in G is the collection of freely 
reduced trivial words in the largest connected component M w of M which contains 
w. The algorithm will be said to be irreducible on freely reduced trivial words in 
G if the words in M w form exactly the family of freely reduced trivial words in G. 

Lemma 2.3. Consider the group $G = \langle a_1, \dots, a_k \mid R_1, \dots, R_m, \dots \rangle$ with $k$ generators.
If $0 < p_c < 1$ and $P(R) > 0$ for all $R \in S$, then the elementary moves defined above
are irreducible on the set of all freely reduced trivial words in that group.

Proof. Consider a relator of G, say $R_1 \in S$. Observe that left-insertions can be
used to change $R_1$ into any other relator $R_m$ of G. Hence, all the relators $R_m$ of
G are in the irreducibility class $M$ of $R_1$. It follows that all cyclic permutations of
the $R_m$, and inverses and their cyclic permutations, are also in $M$. Hence, $S \subseteq M$.

Next, let $C = (w_n)$ be a realisation of a Markov chain with initial state $w_0 = R_1$.
All words $w_n$ sampled by $C$ are obtained by conjugation or by left-insertions by
elements of $S$, and so they are all trivial and freely reduced. Thus $C \subseteq M$ if $C$ is
initiated by $R_1$.

It remains to show that any trivial and freely reduced word can occur in a
realisation of a Markov chain $C$ with initial state $R_1 \in S$.

A word $w \in \{a_1^{\pm 1}, \dots, a_k^{\pm 1}\}^*$ represents the identity element in the group if and
only if it is the product of conjugates of the relators $R_i^{\pm 1}$. So $w$ is the word

$\prod_{j=1}^{s} \rho_j r_j \rho_j^{-1}$

after free reduction, where $\rho_j \in \{a_1^{\pm 1}, \dots, a_k^{\pm 1}\}^*$ and each $r_j$ is one of the $R_i^{\pm 1}$.
We can obtain $w$ using conjugation and left-insertion as follows:

• set $w = r_1$;

• conjugate by $\rho_2^{-1} \rho_1$ one letter at a time to obtain $w = \rho_2^{-1} \rho_1 r_1 \rho_1^{-1} \rho_2$ after
free reduction;

• insert (append) $r_2$ on the right;

• repeat the previous two steps (conjugating by $\rho_{j+1}^{-1} \rho_j$ then appending $r_{j+1}$
on the right) until $r_s$ is inserted;

• conjugate by $\rho_s$.

Since we only ever append $r_j$ to the end of the word, there are no right cancellations,
and at most $|r_j|$ left cancellations.

This completes the proof. □ 

Corollary 2.4. The Monte Carlo algorithm is aperiodic, provided that $P(R) =
P(R^{-1}) > 0$ and $0 < p_c < 1$.

Proof. Let $P_r(w \to v)$ be the one step transition probability from state $w$ to state
$v$ in the Monte Carlo algorithm. The probability of achieving a transition $w \to v$
in $N$ steps is denoted by $P_r^{N}(w \to v)$, and by Lemma 2.3 there exists an $N_0$ such
that $P_r^{N_0}(w \to v) > 0$, if both $w, v$ are freely reduced trivial words.

The rejection technique used in the definition of both the conjugation and left-insertion
elementary moves immediately implies that if $P_r^{N_0}(w \to v) > 0$ then
$P_r^{M}(w \to v) > 0$ for all $M > N_0$. Hence the algorithm is aperiodic. □

A Monte Carlo algorithm which is aperiodic, and irreducible on its state space,
is said to be ergodic. Hence, the algorithm above is ergodic on the state space of freely
reduced trivial words. Under these conditions, the fundamental theorem of Monte Carlo
methods implies that the algorithm samples along a Markov chain $C = (w_n)$ asymptotically
from the stationary distribution given in Lemma 2.2.



2.4. Analysis of Variance. The algorithm was implemented and tested for
accuracy.^2 The stationary distribution of the algorithm (see Lemma 2.2) shows that
the expectation value of the mean length of words sampled for given parameters
$(\alpha, \beta)$ is

(2.12) $E(|w|) = \frac{\sum_w |w| (|w|+1)^{1+\alpha} \beta^{|w|}}{\sum_w (|w|+1)^{1+\alpha} \beta^{|w|}}$

where the summations are over all freely reduced trivial words in G.

We observe two points: The first is that increasing $\beta$ will increase $E(|w|)$. In
fact, there is a critical point $\beta_c$ such that $E(|w|) < \infty$ if $\beta < \beta_c$, and $E(|w|)$ is
divergent if $\beta > \beta_c$. Observe that $\beta_c$ is independent of $\alpha$. The second point is that
increasing $\alpha$ will generally increase the value of $E(|w|)$. This is convenient when
one seeks to estimate the location of $\beta_c$.

Equation (2.12) is a log-derivative of the cogrowth series and will be finite for
$\beta$ below the reciprocal of the cogrowth (being the critical point of the associated
generating function) and divergent above it. Because of this, we identify $\beta_c$ with
the reciprocal of the cogrowth. Hence the convergence of this statistic gives us a
sensitive test of the cogrowth and so the amenability of the group. For example, if
the mean length of words sampled from a group with 2 generators at $\beta = 1/3 + \epsilon$ is
finite for some $\epsilon > 0$, then the cogrowth is less than $3 = 2 \cdot 2 - 1$ and the group is
not amenable.

^2 As a check on our coding, the algorithm was coded independently by two of the authors (AR and
EJJvR), and the results were compared. Further, we ran the simulations making lists of observed
trivial words for short lengths and then compared these against exhaustive enumerations.

The realisation of a Markov chain $C = \{w_0, w_1, \dots, w_n, \dots\}$ by the algorithm
produces a correlated sequence of an observable (for example the length of words).
We denote the sequence of observables by $\{O(w_1), O(w_2), O(w_3), \dots, O(w_n)\}$. The
sample average of the observable over the realised chain is given by

(2.13) $\langle O(w) \rangle_n = \frac{1}{n} \sum_{i=1}^{n} O(w_i).$

This average is asymptotically an unbiased estimator distributed normally about
the expected value $E(O(w))$, given by

(2.14) $E(O(w)) = \frac{\sum_w O(w) (|w|+1)^{1+\alpha} \beta^{|w|}}{\sum_w (|w|+1)^{1+\alpha} \beta^{|w|}}.$

Hence, $\langle O(w) \rangle_n$ may be computed to estimate the expected value $E(O(w))$.

It is harder to determine the variance in the distribution of $\langle O(w) \rangle_n$ about
$E(O(w))$. Although the Markov chain produces a time series of identically
distributed states, they are not independent, and autocorrelations must be computed
along the time series to determine confidence intervals about averages.

The dependence of an observable along a time series is statistically measured
by an autocorrelation function. The autocorrelation function usually decays at an
exponential rate measured by the autocorrelation time $\tau_O$ along the time series.
In particular, the measured connected autocorrelation function of the algorithm is
defined by

(2.15) $S_O(k) = \langle O(w_i) O(w_{i+k}) \rangle_n - \langle O(w) \rangle_n^2,$

and is dependent on $n$, the length of the chain. If $n$ becomes very large, then
$S_O(k)$ measures the correlations between states a distance of $k$ steps apart. The
Markov chain is asymptotically homogeneous (independent of its starting point);
this implies that $\langle O(w_i) \rangle_n \approx \langle O(w) \rangle_n$ if both $n$ and $i$ are large, and if $i \ll n$.
Thus, for large values of $n$ and $i$, the autocorrelation function $S_O(k)$ is only dependent
on the separation $k$ between the observables $O(w_i)$ and $O(w_{i+k})$.

Normally, the autocorrelation function of a homogeneous chain is expected to
decay (to leading order) at an exponential rate given by

(2.16) $S_O(k) \simeq C_O e^{-k/\tau_O}$

where $\tau_O$ is the exponential autocorrelation time of the observable $O$. The
autocorrelation time $\tau_O$ sets a time scale for the decay of correlations in the time series
$\{O(w_i)\}$: If $k \gg \tau_O$, then the states $O(w_i)$ and $O(w_{i+k})$ are for all practical
statistical purposes independent. These observations allow us to compute statistical
confidence intervals on the average $\langle O(w) \rangle_n$ in a systematic way.
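The connected autocorrelation function can be estimated directly from a recorded time series. The following Python sketch is one straightforward estimator, assuming the series is stored as a list of numbers; the function name is hypothetical and the averaging convention (lagged products over the $n - k$ available pairs, mean over the whole series) is a choice made here.

```python
def connected_autocorrelation(series, k):
    """Estimate S_O(k) = <O(w_i) O(w_{i+k})> - <O(w)>^2 from a time series."""
    n_pairs = len(series) - k
    if n_pairs <= 0:
        raise ValueError("lag k must be smaller than the series length")
    mean = sum(series) / len(series)
    lagged = sum(series[i] * series[i + k] for i in range(n_pairs)) / n_pairs
    return lagged - mean * mean
```

Plotting the logarithm of this estimate against $k$ and reading off the slope gives a rough value of the autocorrelation time when the decay is close to exponential.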
Suppose that a time series of length $N$ of observables $\{O(w_i)\}$ was realised by
the Markov chain Monte Carlo algorithm. Cut the time series into blocks of size
$M \ll N$, but with $M \gg \tau_O$. Then one may determine $\lfloor N/M \rfloor$ averages estimating
$\langle O(w) \rangle_n$ over the blocked data, given by

(2.17) $[O(w)]_i = \frac{1}{M} \sum_{j=1}^{M} O(w_{iM+j})$

for $i = 0, 1, \dots, \lfloor N/M \rfloor - 1$.

The sequence of estimates $\{[O(w)]_0, [O(w)]_1, \dots, [O(w)]_{\lfloor N/M \rfloor - 1}\}$ is itself a time
series, and if these are independent estimates, then for large $M \ll N$ its variance
is estimated by determining

(2.18) $S^2_{M,O} = \langle [O]^2 \rangle - \langle [O] \rangle^2,$

where canonical averages $\langle \cdot \rangle$ are taken. So if the $[O(w)]_i$ are treated as independent
measurements of $E(O(w))$, then the 67% statistical confidence interval $\sigma_{M,O}$ is given
by

(2.19) $\sigma_{M,O} = \sqrt{\frac{S^2_{M,O}}{\lfloor N/M \rfloor - 1}}.$

In practical applications the above is implemented by increasing $M \ll N$ until
$\sigma_{M,O}$ is insensitive to further increases. In this event one has $M \gg \tau_O$, and $\sigma_{M,O}$
is the estimated 67% statistical confidence interval on the average computed in
equation (2.13).
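The blocking procedure can be sketched in a few lines of Python. This is an illustration that assumes equation (2.19) has the usual form $\sigma_{M,O} = \sqrt{S^2_{M,O} / (\lfloor N/M \rfloor - 1)}$; the function name is hypothetical.

```python
import math


def blocked_error(series, M):
    """Return (mean of block averages, sigma_{M,O}) for block size M."""
    n_blocks = len(series) // M                  # floor(N / M)
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    # Block averages [O]_i; a trailing incomplete block is discarded.
    blocks = [sum(series[i * M:(i + 1) * M]) / M for i in range(n_blocks)]
    mean = sum(blocks) / n_blocks
    # Variance of the block averages, <[O]^2> - <[O]>^2.
    var = sum(b * b for b in blocks) / n_blocks - mean * mean
    return mean, math.sqrt(max(var, 0.0) / (n_blocks - 1))
```

In use, one would call `blocked_error` for an increasing sequence of block sizes and accept the error bar once it stops growing.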

In this paper we consider the average length of words, that is, $O(w) = |w|$
for each freely reduced and trivial word $w$ sampled by the algorithm. We use our
algorithm to determine the canonical expected length of freely reduced trivial words
with respect to the Boltzmann distribution. This is defined by putting $\alpha = -1$ in
equation (2.12):

(2.20) $E_C(|w|) = \frac{\sum_w |w| \beta^{|w|}}{\sum_w \beta^{|w|}}$

where the summation is over all freely reduced trivial words in G, except the empty
word.

An estimator of $E_C(|w|)$ is obtained by putting $O(w) = |w|/(|w|+1)^{1+\alpha}$ and
$O(w) = 1/(|w|+1)^{1+\alpha}$ in equation (2.14). This gives

(2.21) $E_C(|w|) = \frac{E\left(|w| / (|w|+1)^{1+\alpha}\right)}{E\left(1 / (|w|+1)^{1+\alpha}\right)}.$

In other words, for arbitrary choice of $\alpha$, the ratio estimator

(2.22) $\frac{\langle |w| / (|w|+1)^{1+\alpha} \rangle_n}{\langle 1 / (|w|+1)^{1+\alpha} \rangle_n}$

may be used to estimate the canonical expected length $E_C(|w|)$ over the Boltzmann
distribution on the state space of freely reduced trivial words in G. This is
particularly convenient, as one may choose the parameter $\alpha$ to bias the sampling in
order to obtain better numerical results. For example, it is frequently the case that
(long) trivial words in the tail of the Boltzmann distribution are sampled with low
frequency, and by increasing $\alpha$ the frequency may be increased. This gives larger


ON TRIVIAL WORDS IN FINITELY PRESENTED GROUPS 






sample sizes on long words, improving the accuracy of the numerical estimates of 
the canonical expected length of words. For more details, see for example Section 14 

in ng. 
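The ratio estimator for the canonical mean length amounts to reweighting each sampled length by $1/(|w|+1)^{1+\alpha}$. A minimal Python sketch, assuming the sampled lengths $|w_i|$ from a run at bias $\alpha$ are available as a list (the function name is hypothetical):

```python
def ratio_estimate(lengths, alpha):
    """Estimate the canonical expected length from lengths sampled at bias alpha."""
    num = sum(L / (L + 1) ** (1 + alpha) for L in lengths)
    den = sum(1 / (L + 1) ** (1 + alpha) for L in lengths)
    return num / den
```

For $\alpha = -1$ the weights are all equal to 1 and the estimator reduces to the plain sample mean, as it should.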

2.5. Implementation. The algorithm was implemented using a Multiple Markov
chain Monte Carlo algorithm (THJ, [32], an approach that is also known as
parallel tempering. This greatly reduces autocorrelations in the realised Markov
chains and was achieved as follows: Define a sequence of values of $\beta$ such that
$0 < \beta_1 < \beta_2 < \dots < \beta_m < \beta_c$. Separate chains are initiated at each of the $\beta_i$
and run in parallel. States at adjacent values of the $\beta_i$ are compared and swapped.
This coupling of adjacent chains creates a composite Markov chain, which is itself
ergodic (since each individual chain is) with stationary distribution the product
distribution over all the separate chains. This implementation greatly increases
the mobility of the Markov chains, and reduces autocorrelations. The analysis of
variance follows the outline above. For more detail on a Multiple Markov chain
implementation of Metropolis-style Monte Carlo algorithms, see [23] for example.
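The text does not state the rule used to swap states between adjacent chains. The sketch below uses the standard parallel tempering exchange rule, which is an assumption here: for chains at $\beta_i$ and $\beta_j$ sharing the same $\alpha$, with current word lengths $\ell_i$ and $\ell_j$, the ratio of product stationary weights after and before the swap is $(\beta_i/\beta_j)^{\ell_j - \ell_i}$, since the factors $(|w|+1)^{1+\alpha}$ cancel. The function name is hypothetical.

```python
def swap_probability(beta_i, beta_j, len_i, len_j):
    """Metropolis probability of exchanging the states of two chains.

    Assumes both chains use the same alpha, so only the beta factors of the
    stationary weights (|w| + 1)^(1 + alpha) * beta^|w| survive in the ratio.
    """
    return min(1.0, (beta_i / beta_j) ** (len_j - len_i))
```

With this rule a long word held by the chain at larger $\beta$ is rarely moved to the chain at smaller $\beta$, while the opposite exchange is always accepted.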

In practice we typically initiated 100 chains clustered towards larger values of
$\beta$ where the mobility of the algorithm is low. Each chain was run for about 1000
blocks, each block a total of $2.5 \times 10^7$ iterations. The total number of iterations
over all the chains was $2 \times 10^9$, which typically took about 1 week of
CPU time on a fast desktop Linux station for each group we considered. We also ran
each group at five different $\alpha$ values: $-1, 0, 1, 2, 3$. The larger values of $\alpha$ will ensure
that we sample into the tail of the distribution over trivial words; in practice
the different $\alpha$ values gave consistent results. Data were collected and analysed to
estimate the cogrowth of each group.

In the next sections, we compare our numerical results with exact analysis of 
the Baumslag-Solitar groups. This will demonstrate the validity of our numerical 
approaches above. 

3. Exact cogrowth series for Baumslag-Solitar groups 

3.1. Equations. Consider the Baumslag-Solitar group

$\mathrm{BS}(N,M) = \langle a, b \mid a^N b = b a^M \rangle = \langle a, b \mid a^N b a^{-M} b^{-1} \rangle.$

Our aim is to compute its cogrowth function, or the corresponding generating
function. Rather than obtain this directly, we instead consider the set of words (they
are not required to be freely reduced) which generate elements in the horocyclic
subgroup $\langle a \rangle$; let $\mathcal{H}$ be the set of such words. In what follows we will abuse
notation and when a word $w$ generates an element in a subgroup $\langle a^k \rangle$, we shall
write $w \in \langle a^k \rangle$.

Any word in $\{a^{\pm 1}, b^{\pm 1}\}$ can be transformed into a normal form for the
corresponding group element by "pushing" each $a$ and $a^{-1}$ in the word as far to the
right as possible using the identities $a^{\pm N} b = b a^{\pm M}$ and $a^{\pm M} b^{-1} = b^{-1} a^{\pm N}$, and
replacing $a^{-i} b$ by $a^{N-i} a^{-N} b = a^{N-i} b a^{-M}$ (and similarly for $a^{-j} b^{-1}$, where $0 < i < N$
and $0 < j < M$) so that only positive powers of $a$ appear before a $b^{\pm 1}$ letter. The
resulting normal form can be written as $P a^k$, where $k$ is the $a$-exponent, and $P$ is
a word in the "alphabet" $\{b, ab, \dots, a^{N-1} b, b^{-1}, a b^{-1}, \dots, a^{M-1} b^{-1}\}$ that we call the
prefix (see [27] p. 181).

Consider a normal form $P a^k$. 

• Multiplying this on the right by $a^{\pm 1}$ results in $P a^{k \pm 1}$. 

• If $k = N\ell$ then multiplying on the right by $b$ results in $P b a^{M\ell}$; if $P$ ends in 
a $b^{-1}$ then it will shorten and the $a$-exponent will be updated accordingly. 

• If $k = M\ell$ then multiplying on the right by $b^{-1}$ results in $P b^{-1} a^{N\ell}$; 
if $P$ ends in a $b$ then it will shorten and the $a$-exponent will be updated 
accordingly. 

• Otherwise multiplying by $b^{\pm 1}$ will change the $a$-exponent and lengthen the 
prefix. 

Now define $g_{n,k}$ to be the number of words in $\mathcal{H}$ of length $n$ that generate the 
element with normal form $a^k$. Clearly we have $g_{n,k} = g_{n,-k}$. Define the generating 
function 

(3.1)    $G(z;q) = \sum_{n,k} g_{n,k}\, z^n q^k.$ 
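For small $n$ the numbers $g_{n,k}$ can be checked by brute force. The following is a minimal sketch, not the method used in the paper: it decides membership in $\langle a \rangle$ by repeatedly removing pinches $b^{-1} a^{kN} b = a^{kM}$ and $b\, a^{kM} b^{-1} = a^{kN}$ (Britton's lemma for HNN extensions), rather than by building the normal form $P a^k$. The function names and the letter encoding (`A` for $a^{-1}$, `B` for $b^{-1}$) are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def a_exponent(word, N, M):
    """Return k if `word` equals a^k in BS(N, M), otherwise None.

    Letters: 'a', 'A' (= a^-1), 'b', 'B' (= b^-1).
    """
    lead = 0     # a-exponent to the left of every surviving b-letter
    stack = []   # surviving b-letters as [sign, a-exponent to their right]
    for ch in word:
        if ch in "aA":
            step = 1 if ch == "a" else -1
            if stack:
                stack[-1][1] += step
            else:
                lead += step
            continue
        sign = 1 if ch == "b" else -1
        if stack and stack[-1][0] == -sign:
            e = stack[-1][1]
            # pinches: b^-1 a^{kN} b = a^{kM}  and  b a^{kM} b^-1 = a^{kN}
            src, dst = (N, M) if sign == 1 else (M, N)
            if e % src == 0:
                stack.pop()
                moved = (e // src) * dst
                if stack:
                    stack[-1][1] += moved
                else:
                    lead += moved
                continue
        stack.append([sign, 0])
    return lead if not stack else None

def count_g(n, N, M):
    """Brute-force g_{n,k}: maps k to the number of length-n words equal to a^k."""
    counts = Counter()
    for word in product("aAbB", repeat=n):
        k = a_exponent(word, N, M)
        if k is not None:
            counts[k] += 1
    return counts
```

The enumeration visits all $4^n$ words, so it is only useful as a check on the first few coefficients of $G(z;q)$.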

It is very convenient to define the following subsets of $\mathcal{H}$ and their corresponding 
generating functions. 

• Let $\mathcal{L}$ be the set of words in $\mathcal{H}$ that cannot be written as $uv$ where $u$ 
generates an element with normal form $b^{-1}a^j$ for any $j$. 

• Let $\mathcal{K}$ be the set of words in $\mathcal{H}$ that cannot be written as $uv$ where $u$ 
generates an element with normal form $b a^j$ for any $j$. 

Let the generating functions of these words be $L(z;q)$ and $K(z;q)$ respectively. We 
note that $L(z;1) = K(z;1)$, since the inverse of any word in $\mathcal{L}$ gives a word in $\mathcal{K}$ 
and vice versa. We then need to define the operator $\Phi_{d,e}$ which acts on the above 
generating functions to annihilate all powers of $q$ except those whose exponent 
is equal to $0$ mod $d$, and which maps them to powers equal to $0$ mod $e$: 

(3.2)    $\Phi_{d,e} \circ \sum_n z^n \sum_k c_{n,k}\, q^k = \sum_n z^n \sum_j c_{n,dj}\, q^{ej}.$ 

With these definitions we can write down a set of equations satisfied by the 
generating functions $G(z;q)$, $K(z;q)$ and $L(z;q)$. 

Proposition 3.1. The generating functions G, K, L satisfy the following system of 
equations. 

$L = 1 + z(q + q^{-1})L + z^2 L \cdot [\Phi_{N,M} \circ L + \Phi_{M,N} \circ K] - z^2 [\Phi_{M,N} \circ K] \cdot [\Phi_{N,N} \circ L],$ 
$K = 1 + z(q + q^{-1})K + z^2 K \cdot [\Phi_{M,N} \circ K + \Phi_{N,M} \circ L] - z^2 [\Phi_{N,M} \circ L] \cdot [\Phi_{M,M} \circ K],$ 

and 

$G = 1 + z(q + q^{-1})G + z^2 G \cdot [\Phi_{N,M} \circ L + \Phi_{M,N} \circ K],$ 

where we have written $G = G(z;q)$ etc. 
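Because every term on the right-hand side carries a factor $z$ or $z^2$, the system of Proposition 3.1 determines the coefficient of $z^n$ from lower-order coefficients, so it can be iterated as a fixed point on truncated series. The sketch below is an illustration under that reading, not the authors' C++ implementation; a series is stored as a list whose $n$-th entry is a dictionary mapping the exponent of $q$ to an integer coefficient, and all function names are hypothetical.

```python
def _clean(poly):
    return {k: c for k, c in poly.items() if c}

def mul(A, B, n):
    """Product of two series in z (lists of {q-exponent: coeff}), truncated at z^n."""
    out = [dict() for _ in range(n + 1)]
    for i, a in enumerate(A[: n + 1]):
        for j, b in enumerate(B[: n + 1 - i]):
            for k1, c1 in a.items():
                for k2, c2 in b.items():
                    out[i + j][k1 + k2] = out[i + j].get(k1 + k2, 0) + c1 * c2
    return [_clean(p) for p in out]

def lin(n, *terms):
    """Signed sum of series; each term is a (sign, series) pair."""
    out = [dict() for _ in range(n + 1)]
    for sign, A in terms:
        for i, a in enumerate(A[: n + 1]):
            for k, c in a.items():
                out[i][k] = out[i].get(k, 0) + sign * c
    return [_clean(p) for p in out]

def phi(A, d, e):
    """The operator Phi_{d,e}: keep q^{dj} and send it to q^{ej}."""
    return [{(k // d) * e: c for k, c in a.items() if k % d == 0} for a in A]

def bs_series(N, M, n):
    """Coefficients of z^0..z^n of G(z;q) for BS(N, M), by fixed-point iteration."""
    one = [{0: 1}]
    zQ = [{}, {1: 1, -1: 1}]      # z (q + 1/q)
    z2 = [{}, {}, {0: 1}]         # z^2
    L = K = G = one
    for _ in range(n):            # each pass fixes one more order in z
        pL, pK = phi(L, N, M), phi(K, M, N)
        S = lin(n, (1, pL), (1, pK))
        newL = lin(n, (1, one), (1, mul(zQ, L, n)), (1, mul(z2, mul(L, S, n), n)),
                   (-1, mul(z2, mul(pK, phi(L, N, N), n), n)))
        newK = lin(n, (1, one), (1, mul(zQ, K, n)), (1, mul(z2, mul(K, S, n), n)),
                   (-1, mul(z2, mul(pL, phi(K, M, M), n), n)))
        G = lin(n, (1, one), (1, mul(zQ, G, n)), (1, mul(z2, mul(G, S, n), n)))
        L, K = newL, newK
    return G
```

For example, `bs_series(1, 1, 4)[4][0]` gives the number of trivial words of length 4 in $\mathbb{Z}^2$, and `bs_series(2, 2, 3)[3]` gives the Laurent polynomial $Q^3 + 4Q$ with $Q = q + q^{-1}$.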

We remark that these equations can be transformed into equations for the 
cogrowth series by substituting $z \mapsto \frac{t}{1+3t^2}$ and replacing each generating function 
$f(z) \mapsto h(t)\,\frac{1+3t^2}{1-t^2}$. We found it easier to work with the equations as stated. 
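A quick numerical sanity check of this change of variables is possible for the free group on 2 generators, whose only freely reduced trivial word is the empty word, so that its cogrowth series should be identically 1. The sketch assumes the substitution reads $z \mapsto t/(1+3t^2)$, $f(z) \mapsto h(t)(1+3t^2)/(1-t^2)$, and uses the closed form $3/(1+2\sqrt{1-12z^2})$ for the free-group generating function quoted in Section 3.5; the function names are illustrative.

```python
from math import sqrt

def free_group_returns(z):
    """Generating function of trivial words in the free group on 2 generators."""
    return 3.0 / (1.0 + 2.0 * sqrt(1.0 - 12.0 * z * z))

def cogrowth_series_value(f, t):
    """Value h(t) of the cogrowth series obtained from f via
    z = t/(1+3t^2) and f(z) = h(t) (1+3t^2)/(1-t^2)."""
    z = t / (1.0 + 3.0 * t * t)
    return f(z) * (1.0 - t * t) / (1.0 + 3.0 * t * t)
```

Evaluating `cogrowth_series_value(free_group_returns, t)` for small positive `t` should return 1 up to rounding.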

Proof. First, we note that the set $\mathcal{H}$ is closed under prepending and appending the 
generators $a$ and $a^{-1}$. We factor $\mathcal{H}$ recursively by considering the first letter in any 
word $w \in \mathcal{H}$ (see Figure 1). This gives four cases: 

• $w$ is the empty word. 

• The first letter is $a$ or $a^{-1}$. Then $w = av$ or $w = a^{-1}v$ for some $v \in \mathcal{H}$, 
increasing the length by 1 and altering the $a$-exponent by $\pm 1$. At the level 
of generating functions this gives $z(q + q^{-1})G(z;q)$. 

• The first letter is $b$. Factor $w = uv$ where $u$ is the shortest word so that 
$u \in \langle a \rangle$. Thus, $u = bu'b^{-1}$ for some $u' \in \langle a^N \rangle$. The minimality of $u$ 
ensures $u' \in \mathcal{L}$. Combined, this gives $u \in \langle a^M \rangle$. At the level of generating 
functions, this maps words counted by $z^n q^{kN}$ to $z^{n+2} q^{kM}$, resulting in 
$z^2 \cdot \Phi_{N,M} \circ L(z;q)$. 

• The first letter is $b^{-1}$. Factor $w = uv$ where $u$ is the shortest word so 
that $u \in \langle a \rangle$. As per the previous case, $u = b^{-1}u'b$ for some $u' \in \langle a^M \rangle$ 
with $u' \in \mathcal{K}$. Combined, this gives $u \in \langle a^N \rangle$. Similar reasoning gives 
$z^2 \cdot \Phi_{M,N} \circ K(z;q)$. 

Figure 1. Any word in $\mathcal{H}$ can be decomposed by considering its 
first letter. There are 5 possibilities, falling into the 4 cases we 
have drawn here. The subwords $g, g' \in \mathcal{H}$, $L \in \mathcal{L}$ and $K \in \mathcal{K}$. 

Now consider an element $w \in \mathcal{L}$, and we note that $\mathcal{L}$ (and $\mathcal{K}$) is closed under 
appending the generators $a$ and $a^{-1}$, but not prepending. See Figure 2. In a similar 
manner to the above, we factor words in $\mathcal{L}$ recursively by considering the last letter 
of $w$. 

• $w$ is the empty word. 

• The last letter is $a$ or $a^{-1}$. Then $w = va$ or $w = va^{-1}$ for some $v \in \mathcal{L}$, 
increasing the length by 1 and altering the $a$-exponent by $\pm 1$. This yields 
the term $z(q + q^{-1})L(z;q)$. 

• The last letter is $b^{-1}$. Factor $w = uv$ where $u$ is the longest subword such 
that $u \in \langle a \rangle$ and $v$ is non-empty. This forces $v = bv'b^{-1}$ with the restriction 
that $v' \in \mathcal{L}$. Since both $v, v' \in \mathcal{L}$ we must have $v' \in \langle a^N \rangle$ and $v \in \langle a^M \rangle$, 
and this yields $z^2 L(z;q) \cdot \Phi_{N,M} \circ L(z;q)$. 

• The last letter is $b$. Factor $w = uv$ where $u$ is the longest subword such that 
$u \in \langle a \rangle$ and $v$ is non-empty. This forces $v = b^{-1}v'b$ with the restriction 
that $v' \in \mathcal{K}$. Further, $w \in \mathcal{L}$ implies the subword $u \notin \langle a^N \rangle$. Otherwise, 
$w \notin \mathcal{L}$ as the prefix $ub^{-1}$ generates an element with normal form $b^{-1}a^j$ 
for some $j$. 

The generating function for $\{w \in \mathcal{L} \mid w \notin \langle a^N \rangle\}$ is given by 
$(L - \Phi_{N,N} \circ L)$, and so this last case gives 
$z^2 (L(z;q) - \Phi_{N,N} \circ L(z;q)) \cdot \Phi_{M,N} \circ K(z;q)$. 

Putting all of these cases together and rearranging gives the result. The equation 
for $K$ follows a similar argument. □ 

Figure 2. Any word in $\mathcal{L}$ can be decomposed by considering its 
last letter. This results in the four possible factorisations we have 
drawn here. The subwords $L, L' \in \mathcal{L}$, $K \in \mathcal{K}$, and $u$ is a word in 
$\mathcal{L}$ that generates an element in the subgroup $\langle a \rangle$, but not in the 
subgroup $\langle a^N \rangle$. 

3.2. Solution for BS(N,N). The number of trivial words of length $n$ in $\mathbb{Z}^2 = 
BS(1,1)$ has long been known to be $\binom{n}{n/2}^2$ for even $n$.³ This number grows as 
$4^{n+1/2}/(\pi n)$, and the factor of $n^{-1}$ implies that the corresponding generating function 
is not algebraic (see, for example, section VII.7 of [H]). The generating function 
does satisfy a linear differential equation with polynomial coefficients and so is 
D-finite [31] (in fact it can be written in closed form in terms of elliptic integrals). 
The class of D-finite functions includes rational and algebraic functions and many of 
the most famous functions in mathematics and physics. Indeed, most of the known 
group growth and cogrowth series are D-finite (being algebraic or rational). We 
prove (below) that when $N = M$, the cogrowth series is D-finite and we strongly 
suspect that when $N \neq M$, the cogrowth series lies outside this class. 
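The count for $\mathbb{Z}^2$ and its asymptotic form are easy to compare numerically. The short sketch below (function names are illustrative) computes $\binom{n}{n/2}^2$ exactly and its ratio to $4^{n+1/2}/(\pi n) = 2 \cdot 4^n/(\pi n)$; the ratio tends to 1 as $n$ grows.

```python
from math import comb, pi

def trivial_words_Z2(n):
    """Number of length-n words in a, a^-1, b, b^-1 that are trivial in Z^2."""
    return comb(n, n // 2) ** 2 if n % 2 == 0 else 0

def ratio_to_asymptotic(n):
    """trivial_words_Z2(n) divided by 2 * 4^n / (pi * n), for even n."""
    # Divide the two big integers first so that no huge float is ever formed.
    return (trivial_words_Z2(n) / 4**n) * pi * n / 2
```

The polynomial correction $n^{-1}$ that appears here is what rules out an algebraic generating function.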

Proposition 3.2. When $N = M$ the generating functions $K(z;q) = L(z;q)$, and 
the generating functions $L, G$ satisfy 

$L = 1 + z(q + q^{-1})L + 2z^2 L \cdot [\Phi_{N,N} \circ L] - z^2 [\Phi_{N,N} \circ L]^2,$ 
$G = 1 + z(q + q^{-1})G + 2z^2 G \cdot [\Phi_{N,N} \circ L].$ 

Further, these equations reduce to a set of algebraic equations in $G$, $L$ and $[\Phi_{N,N} \circ L]$. 
In particular if we write $L_0(z;q) = [\Phi_{N,N} \circ L]$, and let $\omega = e^{2\pi i/N}$ then we have 

$N L_0(z;q) = \sum_{j=0}^{N-1} L(z;\omega^j q) = \sum_{j=0}^{N-1} \frac{1 - z^2 L_0(z;q)^2}{1 - z(\omega^j q + 1/(\omega^j q)) - 2z^2 L_0(z;q)}.$ 

For example, for BS(2,2) the generating function $G(z;q)$ satisfies the following 
cubic equation 

(3.3)    $1 + 3zQG - (1 - 4z^2 - z^2Q^2)G^2 - zQ(1 - zQ - 2z)(1 - zQ + 2z)G^3 = 0,$ 

where we have written $Q = q + q^{-1}$. 



3 Perhaps the easiest proof known to the authors is the following. Map any trivial word to a 
path on the square grid. Now rotate the grid 45° and rescale slightly. Each step now changes 
the x-ordinate by ±1 and similarly the y-ordinate by ±1. In a path of $n$ steps, $n/2$ steps must 
increase the x-ordinate and $n/2$ must decrease it, giving $\binom{n}{n/2}$ possibilities. The same 
occurs independently for the y-ordinates and so we get $\binom{n}{n/2}^2$ possible trivial words. 

Proof. The proof is a corollary of Proposition 3.1. Setting $N = M$ simplifies the 
equations considerably and forces $K(z;q) = L(z;q)$. We note that $L_0(z;q) = 
L_0(z;\omega q)$ and the equation for $L_0(z;q)$ follows. Hence both $L(z;q)$ and $G(z;q)$ are 
also algebraic. □ 
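The step from the functional equation to the sum over roots of unity is stated briefly in the proof. One way to fill it in, using only the definitions of $\Phi_{N,N}$ and $L_0$ given above, is sketched here.

```latex
% Solve the first equation of Proposition 3.2 for L, writing L_0 = \Phi_{N,N} \circ L:
L(z;q)\,\bigl[1 - z(q + q^{-1}) - 2z^2 L_0(z;q)\bigr] = 1 - z^2 L_0(z;q)^2
\quad\Longrightarrow\quad
L(z;q) = \frac{1 - z^2 L_0(z;q)^2}{1 - z(q + q^{-1}) - 2z^2 L_0(z;q)} .

% Root-of-unity filter, with \omega = e^{2\pi i/N}:
\frac{1}{N}\sum_{j=0}^{N-1} \omega^{jk} =
\begin{cases} 1, & N \mid k, \\ 0, & \text{otherwise,} \end{cases}
\qquad\text{so}\qquad
\sum_{j=0}^{N-1} L(z;\omega^j q) = N\,\Phi_{N,N}\circ L = N L_0(z;q).

% L_0 contains only powers q^{Nj}, hence L_0(z;\omega^j q) = L_0(z;q); substituting gives
N L_0(z;q) = \sum_{j=0}^{N-1}
\frac{1 - z^2 L_0(z;q)^2}{1 - z\bigl(\omega^j q + 1/(\omega^j q)\bigr) - 2z^2 L_0(z;q)} .
```

Clearing denominators leaves a polynomial relation between $L_0$, $z$ and $q$, which is the algebraicity used in the proof.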

We are not interested in the full generating function $G$; rather, we are mainly 
interested in the coefficient of $q^0$. 

Corollary 3.3. For BS(N,N) the generating function $[q^0]G(z;q) = \sum_n g_{n,0} z^n$ is 
D-finite. That is, it satisfies a linear differential equation with polynomial coefficients. 
Furthermore, the cogrowth series (being the generating function of freely reduced 
words equivalent to the identity) is also D-finite. 

It follows that the cogrowth of BS(N,N) is an algebraic number. 

Proof. Every algebraic power series also satisfies a linear differential equation with 
polynomial coefficients (see [31] for many basic facts about D-finite series). It is 
known [26] that the constant term of a D-finite series of two variables is a D-finite 
series of a single variable. Substituting an algebraic series into a D-finite series gives 
another D-finite series, and so transforming from [q°]G(z; q) to the cogrowth series 
(which is done by substituting a rational function) yields another D-finite series. 

Finally, if a function satisfies a linear differential equation, then its singularities 
must correspond to zeros of the coefficient of the highest order derivative. Since 
the cogrowth series is D-finite, its singularities must be the zeros of the polynomial 
coefficient of the highest order derivative. □ 

While the results used to prove the above corollary guarantee the existence of 
such differential equations, they do not give recipes for determining them. There has 
been a small industry in developing algorithms to do exactly this task (and many 
other operations on D-finite series) — for example work by Zeilberger, Chyzak and 
others. Here we have used recent algorithms developed by Chen, Kauers and Singer 
and we are grateful for Manuel Kauers' help in the application of these tools. 

Applying the algorithms described in [XT] to the generating function $G(z;q)$ for 
BS(2,2), which is the solution of equation (3.3), we found a 6th order linear 
differential equation satisfied by $[q^0]G(z;q)$. Unfortunately the polynomial coefficients 
of this equation have degrees up to 47. We were also able to guess slightly more 
appealing equations of higher order with lower degree coefficients, but all are too 
large to list here. 

For BS(3,3) and BS(4,4) we obtain the following equations for $G(z;q)$ (where 
$Q = q + q^{-1}$) 

(3.4)    $1 + 4zQG + (6Q^2z^2 - z^2 - 1)G^2 + 2z(Qz + 1)(Q^2z - Q + 2z)G^3$ 
         $+\, z^2(1 - Q)(1 + Q)(Qz + 2z - 1)(Qz - 2z - 1)G^4 = 0$ 

and 

(3.5)    $1 + 5GQz + (10Q^2z^2 - 2z^2 - 1)G^2 + z(10Q^3z^2 - 6Qz^2 - 3Q + 4z)G^3$ 
         $+\, z^2(3Q^4z^2 + 2Q^2z^2 - 3Q^2 + 8Qz - 8z^2 + 2)G^4$ 
         $-\, z^3Q(Q^2 - 2)(Qz + 2z - 1)(Qz - 2z - 1)G^5 = 0.$ 

Again applying the same methods, we found an ODE of order 8 with coefficients 
of degree up to 105 for BS(3,3), and for BS(4,4) it is order 10 with coefficients of 
degree up to 154. Using clever guessing techniques (see [24] for a description) Kauers 
also found DEs for $N = 5, \ldots, 10$. For BS(5,5) the DE is order 12 with coefficients 
of degree up to 301, while that of BS(10,10) took about 50 days of computer time 
to guess and is 22nd order with coefficients of degree up to 1153; when written to a 
text file it is over 6 Mb! We note that the ODEs found for $N = 2, 3, 4$ have been 
proved, but it is beyond current techniques⁴ to prove those found for higher $N$. 

Clearly this approach is not a practical means to study the cogrowth for larger 
$N$, though one can generate series expansions quite quickly using a computer. 
We are able to determine the radius of convergence of $[q^0]G(z;q)$ for much higher 
$N$ via the following lemma. 

Lemma 3.4. For BS(N,N), the generating functions $G(z;1)$ and $[q^0]G(z;q)$ have 
the same radius of convergence. 

Proof. We start with some notation. Write 

(3.6)    $G(z;q) = \sum_{n=0}^{\infty} \sum_{k=-n}^{n} g_{n,k}\, z^n q^k, \qquad G(z;1) = \sum_{n=0}^{\infty} g_n z^n.$ 

Note that we have $g_{n,-k} = g_{n,k}$ and that $g_{n,k} = 0$ for $|k| > n$. Write 
$\limsup g_n^{1/n} = \mu$ and $\limsup g_{n,0}^{1/n} = \mu_0$. Since all the $g_{n,k}$ are non-negative, 
we immediately have $\mu \geq \mu_0$. 

To prove the reverse inequality we use a "most popular" argument that is 
commonly used in statistical mechanics to prove equalities of free-energies (see [22] for 
example). Fix $n$, then there exists $k^*$ (depending on $n$) so that $g_{n,k^*} \geq g_{n,k}$ for all $k$; the 
number $k^*$ is the "most popular" $a$-exponent in all the trivial words of length $n$ 
contributing to the generating function $G$. We have 

(3.7)    $g_{n,k^*} \leq g_n \leq (2n + 1)\, g_{n,k^*},$ 

and hence $\limsup g_{n,k^*}^{1/n} = \mu$. Note that numerical experiments show that $k^* = 0$; 
the distribution is tightly peaked around 0. 

Keeping $n$ fixed, consider a word that contributes to $g_{n,k^*}$ and another that 
contributes to $g_{n,-k^*}$. Concatenating them together gives a word that contributes 
to $g_{2n,0}$. So considering all possible concatenations of $M$ such pairs of words gives 
the following inequality 

(3.8)    $(g_{n,k^*}\, g_{n,-k^*})^M = g_{n,k^*}^{2M} \leq g_{2Mn,0}.$ 

Raising both sides to the power $1/(2nM)$ and letting $M \to \infty$ gives 

(3.9)    $g_{n,k^*}^{1/n} \leq \mu_0.$ 

Letting $n \to \infty$ then shows that $\mu \leq \mu_0$. □ 

We have observed that the statement of the lemma appears to hold for 
Baumslag-Solitar groups BS(M,N) for $M \neq N$ also; however, the above proof breaks down in 
the general case as the number of summands in equation (3.7) grows exponentially 
with $n$ rather than linearly. 

4 While there is no theoretical barrier, the time taken by the computations seems to grow 
quickly with $N$ and exceed the available time. 

Combining Proposition 3.2 and the above lemma we can establish the growth 
rates of trivial words $\mu$ and the corresponding cogrowths $\lambda$ for the first few values 
of $N$ (see Table 1). Unfortunately we have not been able to find a general form for 
these numbers. Some simple numerical analysis of these numbers suggests that the 
growth rate approaches $\sqrt{12}$ exponentially with increasing $N$. This finding agrees 
with work of Guyot and Stalder [21], discussed below, who examined the limit of the 
marked groups BS(M,N) as $M, N \to \infty$, and found that the groups tend towards 
the free group on two letters, which has an asymptotic cogrowth rate of $\sqrt{12}$. 

  N     μ              λ 
  1     4              3 
  2     3.792765039    2.668565568 
  3     3.647639445    2.395062561 
  4     3.569497357    2.215245886 
  5     3.525816111    2.091305394 
  6     3.500607636    2.002421757 
  7     3.485775158    1.936941986 
  8     3.476962757    1.887871818 
  9     3.471710431    1.850717434 
  10    3.468586539    1.822458708 

Table 1. The growth rate $\mu$ of trivial words in BS(N,N) and 
the corresponding cogrowth $\lambda$. Note that $\mu$ and $\lambda$ are related by 
$\mu = \lambda + 3/\lambda$, and that the growth rate of trivial words in the free 
group on 2 generators is $\sqrt{12} = 3.464101615$. 
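The relation $\mu = \lambda + 3/\lambda$ between the two columns of Table 1 can be inverted directly. The helper below is a small illustrative sketch (the function name is an assumption); it solves $\lambda^2 - \mu\lambda + 3 = 0$ and takes the larger root, which is the one satisfying $\lambda \geq \sqrt{3}$.

```python
from math import sqrt

def cogrowth_from_growth(mu):
    """Solve mu = lam + 3/lam for the root with lam >= sqrt(3)."""
    return (mu + sqrt(mu * mu - 12.0)) / 2.0
```

At the free-group value $\mu = \sqrt{12}$ the discriminant vanishes and the two roots coincide at $\lambda = \sqrt{3}$, so small errors in $\mu$ near that value are strongly amplified in $\lambda$.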

We remark that for $BS(1,1) = \mathbb{Z}^2$ the number of trivial words is known exactly 
and hence so is the dominant asymptotic form 

$g_{n,0} = \binom{n}{n/2}^2 \sim \frac{2 \cdot 4^n}{\pi n}$    for even $n$. 



In the case of $N = 2, 3, 4, 5$ we can show from the differential equations found above 
that 

$g_{n,0} \sim A_N\, \mu_N^n\, n^{-2}$    for even $n$, 

where $\mu_N$ is given in the previous corollary and we have estimated the amplitudes 
to be 

$A_2 = 12.47372070225776\ldots$    $A_3 = 10.81007294255599\ldots$ 

$A_4 = 12.14125535742978\ldots$    $A_5 = 14.73149478993552\ldots$ 

Unfortunately we have not been able to identify these constants, but these 
observations lead to the following conjecture. 

Conjecture 1. The number of trivial words in BS(N,N) grows as 

$g_{n,0} \sim A_N\, \mu_N^n\, n^{-2}$ 

for $N \geq 2$. 

3.3. Continued fractions and BS(1,M). When we set $N = 1$ cancellations 
occur and the equation for $L$ becomes a $q$-deformation of a Catalan generating 
function: 

(3.10)   $L(z;q) = 1 + z(q + q^{-1})L(z;q) + z^2 L(z;q)L(z;q^M) = \frac{1}{1 - z(q + q^{-1}) - z^2 L(z;q^M)},$ 

         $L(z;1) = \frac{1 - 2z - \sqrt{1 - 4z}}{2z^2}.$ 

Setting $q = 1$ in the first equation reduces it to an algebraic equation, which is readily 
solved to give $L(z;1)$, the generating function of the Catalan numbers. Thus $L(z;q)$ 
is a $q$-deformation of the Catalan numbers and rearranging the first equation shows 
that $L(z;q)$ has a simple continued fraction expansion. 

(3.11)   $L(z;q) = \cfrac{1}{1 - z(q + q^{-1}) - \cfrac{z^2}{1 - z(q^M + q^{-M}) - \cfrac{z^2}{1 - z(q^{M^2} + q^{-M^2}) - \cdots}}}.$ 
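The continued fraction can be evaluated numerically by truncating it after finitely many levels. The sketch below is an illustration only (names and the default depth are assumptions): it evaluates the truncation from the innermost level outwards, and at $q = 1$ it can be compared with the closed form for $L(z;1)$ given in (3.10), valid for $0 < z < 1/4$.

```python
from math import sqrt

def L_continued_fraction(z, q, M, depth=12):
    """Evaluate the continued fraction for L(z;q) truncated after `depth` levels."""
    value = 0.0                       # the tail beyond the last level is dropped
    for j in reversed(range(depth)):
        qj = q ** (M ** j)            # level j involves q^(M^j)
        value = 1.0 / (1.0 - z * (qj + 1.0 / qj) - z * z * value)
    return value

def L_closed_form_q1(z):
    """L(z;1) = (1 - 2z - sqrt(1 - 4z)) / (2 z^2)."""
    return (1.0 - 2.0 * z - sqrt(1.0 - 4.0 * z)) / (2.0 * z * z)
```

For $q \neq 1$ the powers $q^{M^j}$ grow very quickly with the level, so only a modest depth is practical in floating point.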

Such continued fraction forms are well known and understood in Catalan objects 
(see [15] for example). Unfortunately the equation for $K$ does not simplify: 

(3.12)   $K = 1 + z(q + q^{-1})K + z^2 L(z;q^M) \cdot [K - \Phi_{M,M} \circ K] + z^2 K \cdot [\Phi_{M,1} \circ K].$ 

Though as noted above $K(z;1) = L(z;1)$ and so we expect $K(z;q)$ to be a different 
$q$-deformation of the Catalan numbers. For $G$ we have made even less progress 
and we have not found $G(z;1)$, let alone $G(z;q)$, in closed form. Because of the 
$q$-deformed nature of $L(z;q)$ we conjecture the following. 

Conjecture 2. For Baumslag-Solitar groups BS(1,M) with $M > 1$, the generating 
functions $G(z;q)$ and $[q^0]G(z;q)$ are not D-finite. 

Since any path that contributes to $K$ or $L$ must also contribute to $G$, it follows 
that the radius of convergence of $G(z;1)$ is at most 1/4, and of course cannot be 
any smaller. Since the groups BS(1,N) are all amenable, we know that $g_{n,0} \sim 4^n$. 
We have been unable to prove any more precise details of the asymptotic form, 
though it is not unreasonable to expect that 

(3.13)   $g_{n,0} \sim A\, 4^n\, n^{-\gamma_M}.$ 

While we have been able to generate the first 50-60 terms of the series for $M \leq 5$ 
by iterating the equations, the series are quite badly behaved and we have been 
unable to produce reasonable estimates of $\gamma_M$. 

3.4. When $N \neq M$. When $N \neq M$, we expect that the operators $\Phi_{N,M}$ and 
$\Phi_{M,N}$ in the equations satisfied by $G, K, L$ give rise to $q$-deformations similar to 
those observed above. In light of this, we extend our previous conjecture: 

Conjecture 2 (Extended from the above). For Baumslag-Solitar groups BS(N,M) 
the generating functions $G(z;q)$ and $[q^0]G(z;q)$ are only D-finite when $N = M$. 

In spite of the absence of D-finite recurrences, we can still use the equations above 
to determine the first few terms of the cogrowth series. The resulting algorithm to 
compute the first $n$ terms of the series requires time and memory that are 
exponential in $n$. The coefficient of $z^n$ is a Laurent polynomial whose degree is exponential 
in $n$, and this exponential growth becomes worse as $\max\{N/M, M/N\}$ becomes 
larger. In spite of this, iteration of these equations to obtain the cogrowth series is 
exponentially faster than more naive methods based on, say, a simple backtracking 
exploration of the Cayley graph, or iteration of the corresponding adjacency matrix. 

  N \ M   1    2              3              4              5 
  1       4    4              4              4              4 
  2       o    3.792765039    3.724*         3.701*         3.676* 
  3       o    o              3.647639445    3.604*         3.585* 
  4       o    o              o              3.569497357    3.538* 
  5       o    o              o              o              3.525816111 

Table 2. Exact and estimated growth rates of trivial words for 
Baumslag-Solitar groups BS(N,M). The estimated growth rates 
are denoted with a * and we expect that the error lies in the last 
decimal place; due to the difficulty of obtaining estimates, they 
should be considered to be quite rough. 

The time and memory requirements can be further improved since we are 
primarily interested in the constant term; this means that we do not need to keep high 
powers of $q$. More precisely if we wish to compute the series to $O(z^n)$, then we 
only need to retain those powers of $q$ that will contribute to $[q^0 z^n]G(z;q)$. We must 
compute the coefficients of $z^k$ for $k < n/2$ exactly, but we can "trim" subsequent 
coefficients: the degree of the coefficient of $z^{n/2+k}$ need only be that of $z^{n/2-k}$. 

Simple C++ code using CLN⁵ running on a moderate desktop allowed us to generate 
about the first 50 terms of $[q^0]G(z;q)$ for BS(1,5), while over 300 terms for BS(4,5) 
were obtained. The series lengths for the other groups (with $N \leq M \leq 5$) ranged between 
these extremes. We have estimated the growth rate of trivial words using differential 
approximants; see Table 2. Again, as in the $N = 1$ case, we find the series to be 
very badly behaved (except when $N = M$) and we have only obtained quite rough 
estimates. 

3.5. The limit as $N, M \to \infty$. Beautiful work of Luc Guyot and Yves Stalder 
[21] demonstrates that in the limit as $N, M \to \infty$ the (marked) group BS(N,M) 
becomes the free group on 2 generators. We note that we can recover this free 
group behaviour in the functional equations we have obtained. 

In particular as $N, M \to \infty$, the operators $\Phi_{N,N}$, $\Phi_{M,M}$, $\Phi_{N,M}$ and $\Phi_{M,N}$ become 
the constant-term operators. So in this limit the equations for $K$ and $L$ from 
Proposition 3.1 become 

         $L = 1 + z(q + q^{-1})L + z^2 L \cdot [L_0 + K_0] - z^2 K_0 L_0,$ 

(3.14)   $K = 1 + z(q + q^{-1})K + z^2 K \cdot [K_0 + L_0] - z^2 K_0 L_0,$ 

where $K_0(z) = [q^0]K(z;q)$ and $L_0(z) = [q^0]L(z;q)$. Clearly $K(z;q) = L(z;q)$ and 
so with a little rearranging 

(3.15)   $L(z;q) = \frac{1 - z^2 L_0(z)^2}{1 - z(q + q^{-1}) - 2z^2 L_0(z)} = \frac{1 - z^2 L_0^2}{1 - 2z^2 L_0} \sum_{n \geq 0} \left( \frac{z(q + q^{-1})}{1 - 2z^2 L_0} \right)^n.$ 



5 An open source C++ library for computations with large integers. At time of writing it is 
available from http://www.ginac.de/CLN/ 

Taking the constant term of both sides then gives 

(3.16)   $L_0 = \frac{1 - z^2 L_0^2}{1 - 2z^2 L_0} \sum_{n \geq 0} \binom{2n}{n} \left( \frac{z}{1 - 2z^2 L_0} \right)^{2n} = \frac{1 - z^2 L_0^2}{1 - 2z^2 L_0} \left( 1 - \frac{4z^2}{(1 - 2z^2 L_0)^2} \right)^{-1/2}.$ 

Simplifying this last expression further gives $(3z^2 L_0^2 - L_0 + 1)(z^2 L_0^2 - L_0 - 1) = 0$. 
The only positive term power series solution of this gives $L_0$ and a similar exercise 
gives $[q^0]G(z;q)$: 

(3.17)   $L_0 = \frac{1 - \sqrt{1 - 12z^2}}{6z^2}, \qquad [q^0]G = \frac{3}{1 + 2\sqrt{1 - 12z^2}}.$ 

The expression for $[q^0]G$ is the number of trivial words in the free group on 2 
generators. 
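The closed form for the free group can be checked against a direct count. Trivial words in the free group on 2 generators correspond to closed walks from the root of the 4-regular tree, and these can be counted with a small dynamic programme over the distance from the root. This is an independent illustrative check with hypothetical function names, not part of the derivation above.

```python
from math import sqrt

def free_group_trivial_counts(nmax):
    """counts[n] = number of length-n words in a, a^-1, b, b^-1 that are
    trivial in the free group on 2 generators, for n = 0..nmax."""
    walks = [1] + [0] * (nmax + 1)    # walks[d]: walks currently at distance d
    counts = [1]
    for _ in range(nmax):
        new = [0] * (nmax + 2)
        for d, w in enumerate(walks[: nmax + 1]):
            if w == 0:
                continue
            if d == 0:
                new[1] += 4 * w       # every letter moves away from the root
            else:
                new[d - 1] += w       # one letter cancels the last one
                new[d + 1] += 3 * w   # three letters move further away
        walks = new
        counts.append(walks[0])
    return counts

def free_group_closed_form(z):
    return 3.0 / (1.0 + 2.0 * sqrt(1.0 - 12.0 * z * z))
```

The counts begin 1, 0, 4, 0, 28, 0, 232, and summing `counts[n] * z**n` for small `z` reproduces the closed form.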

4. Analysis of random sampling data 

4.1. Preliminaries. Using our multiple Markov chain Monte Carlo algorithm we 
have sampled trivial words from the following groups: 

• Thompson's group F with the following 3 presentations 

(4.1)    $\langle a, b \mid [ab^{-1}, a^{-1}ba],\ [ab^{-1}, a^{-2}ba^{2}] \rangle$, 

(4.2)    $\langle a, b, c, d \mid c = a^{-1}ba,\ d = a^{-1}ca,\ [ab^{-1}, c],\ [ab^{-1}, d] \rangle$, 

(4.3)    $\langle a, b, c, d, e \mid c = a^{-1}ba,\ d = a^{-1}ca,\ e = ab^{-1},\ [e, c],\ [e, d] \rangle$. 

Note that the generators $a, b, c, d$ above are often called $x_0, x_1, x_2, x_3$ 
respectively in Thompson's group literature. We have used some simple Tietze 
transformations (see [27] p. 89) to obtain the second and third 
presentations from the first (standard) finite presentation of F. 

• Baumslag-Solitar groups BS(N,M) with 

(N, M) = (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3), (3, 5). 

• The Basilica group has presentation 

(4.4)    $G = \langle a, b \mid [a^n, [a^n, b^n]] \text{ and } [b^n, [b^n, a^{2n}]],\ n \text{ a power of } 2 \rangle$ 

and embeds in the finitely presented group [20] 

(4.5)    $\tilde{G} = \langle a, t \mid a^{t^2} = a^2,\ [[[a, t^{-1}], a], a] = 1 \rangle$. 

The groups $G$ and $\tilde{G}$ are both amenable [4]. 

We examined two presentations of $\tilde{G}$: the first is obtained from the 
above by putting $b = [a, t^{-1}]$, and the second by putting $b = a^t$. 
Simplification gives the representations 

(4.6)    $\tilde{G} = \langle a, b, t \mid [a, t^{-1}] = b,\ a^{t^2} = a^2,\ [[b, a], a] = 1 \rangle$, 

(4.7)    $\tilde{G} = \langle a, b, t \mid a^t = b,\ b^t = a^2,\ b^{-1}aba^{-1}b^{-1}a^{-1}ba = 1 \rangle$. 

• Other groups for which the cogrowth series is known: 

         $K_1 = \langle a, b \mid a^2 = b^3 = 1 \rangle$, 
         $K_2 = \langle a, b \mid a^3 = b^3 = 1 \rangle$, 

(4.8)    $K_3 = \langle a, b, c \mid a^2 = b^2 = c^2 = 1 \rangle$. 

The exact solutions for $BS(1,1) = \mathbb{Z}^2$, BS(2,2) and BS(3,3) are described above, 
and for the other Baumslag-Solitar groups we have computed series expansions. 
For the last three groups, the cogrowth series were found by Kuksov [25] and are 
(respectively) 

         $C(t) = \frac{(1 + t)\left([0, -1, 1, -8, 3, -9] + (2 - t + 6t^2)\sqrt{[1, -2, 1, -6, -8, -18, 9, -54, 81]}\right)}{2(1 - 3t)(1 + 3t^2)(1 + 3t + 3t^2)(1 - t + 3t^2)},$ 

         $C(t) = \frac{(1 + t)\left(-t + \sqrt{1 - 2t - t^2 - 6t^3 + 9t^4}\right)}{(1 - 3t)(1 + 2t + 3t^2)},$ 

(4.9)    $C(t) = \frac{-1 - 5t^2 + 3\sqrt{1 - 22t^2 + 25t^4}}{2(1 - 25t^2)},$ 

where $[c_0, c_1, \ldots, c_n] = c_0 + c_1 t + \cdots + c_n t^n$. 
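As a small consistency check on the last of these series, the freely reduced trivial words of $K_3$ can be enumerated directly for short lengths. The sketch below assumes the generating set $\{a^{\pm 1}, b^{\pm 1}, c^{\pm 1}\}$ (six letters, written here with upper case for inverses) and solves the word problem in $\mathbb{Z}_2 * \mathbb{Z}_2 * \mathbb{Z}_2$ by cancelling adjacent equal generators; the function names are illustrative.

```python
from itertools import product
from math import sqrt

LETTERS = "aAbBcC"   # upper case denotes the inverse generator

def is_freely_reduced(word):
    """No letter is immediately followed by its own inverse."""
    return all(x != y.swapcase() for x, y in zip(word, word[1:]))

def is_trivial_in_K3(word):
    """Word problem in Z_2 * Z_2 * Z_2: cancel adjacent equal generators."""
    stack = []
    for ch in word:
        g = ch.lower()               # a^-1 = a because a^2 = 1
        if stack and stack[-1] == g:
            stack.pop()
        else:
            stack.append(g)
    return not stack

def cogrowth_counts_K3(nmax):
    """Number of freely reduced trivial words of each length 0..nmax."""
    return [sum(1 for w in product(LETTERS, repeat=n)
                if is_freely_reduced(w) and is_trivial_in_K3(w))
            for n in range(nmax + 1)]

def C_K3(t):
    """Closed form for the cogrowth series of K_3."""
    return (-1 - 5 * t**2 + 3 * sqrt(1 - 22 * t**2 + 25 * t**4)) / (2 * (1 - 25 * t**2))
```

The enumeration gives $1 + 6t^2 + 78t^4 + \cdots$, which agrees with the expansion of the closed form at small $t$.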

In each case we have obtained estimates of the mean length of freely reduced 
words as a function of $\beta$. More precisely, for each group we estimated 

(4.10)   $E(n^k)(\beta) = \frac{\sum_w |w|^k (|w| + 1)^{1+\alpha} \beta^{|w|}}{\sum_w (|w| + 1)^{1+\alpha} \beta^{|w|}}$ 

for $k = \pm 1, \pm 2$ and a range of different $\alpha$ values, where the sums are over all 
non-empty freely reduced trivial words. These expectations are dependent on $\alpha$, 
but one can use Equation (2.22) to form $\alpha$-independent estimates of the canonical 
expectations. Given a knowledge of the cogrowth series we can quickly compute 
these same means to any desired precision, since we can also write 

(4.11)   $E(n^k)(\beta) = \frac{\sum_{n>0} n^k (n + 1)^{1+\alpha} p_n \beta^n}{\sum_{n>0} (n + 1)^{1+\alpha} p_n \beta^n}$ 

where $p_n$ is the number of freely reduced trivial words of length $n$. Note that as $\alpha$ is 
increased, the samples are biased towards longer words. This expression is convergent 
for $\beta$ below the reciprocal of the cogrowth (being the critical point of the associated 
generating function) and divergent above it. The convergence at the critical point 
depends on the precise details of the asymptotics of $p_n$ and will be affected by $\alpha$. 
This then points to a simple way to test for amenability: 

If the mean length of sampled words from a group on $k$ generators is finite 
for $\beta$ slightly above $\beta_c = (2k - 1)^{-1}$ then the group is not amenable. 
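Given the first terms of a cogrowth series, the mean in the second form above is a ratio of two finite sums. The following sketch assumes that reading of the formula (a weight $(n+1)^{1+\alpha} p_n \beta^n$ on words of length $n > 0$); the function names are hypothetical, and a truncated series only approximates the mean, increasingly poorly as $\beta$ approaches the critical point.

```python
def mean_power(p, beta, alpha, k=1):
    """E(n^k) for weights (n+1)^(1+alpha) * p[n] * beta^n, summed over n > 0.

    p[n] is the number of freely reduced trivial words of length n.
    """
    num = den = 0.0
    for n, pn in enumerate(p):
        if n == 0 or pn == 0:
            continue
        w = (n + 1) ** (1 + alpha) * pn * beta**n
        num += n**k * w
        den += w
    return num / den

def amenable_critical_beta(k):
    """beta_c = 1/(2k - 1) for an amenable group on k generators."""
    return 1.0 / (2 * k - 1)
```

For a 2-generator group the amenable value is $\beta_c = 1/3$, for 4 generators $1/7$ and for 5 generators $1/9$.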



4.2. Amenable groups. We studied the groups $\mathbb{Z}^2 = BS(1,1)$, BS(1,2) and 
BS(1,3). The cogrowth series for $\mathbb{Z}^2$ is known exactly, while we relied on our 
series expansions to compute statistics for the other two groups. Figure 3 shows 
the plots of the mean length as a function of $\beta$. 

In the case of $BS(1,1) = \mathbb{Z}^2$ we see excellent agreement between the numerical 
estimates generated by our algorithm and the mean length computed from the exact 
cogrowth series. For BS(1,2) and BS(1,3) we see good agreement for low $\beta$ between 
our numerical data and the mean length computed from the exact cogrowth series. 
However at larger values of $\beta$ it appears that the cogrowth series systematically 
underestimates the mean length, compared to the numerical Monte Carlo data. 
This is, in fact, due to the modest length of the cogrowth series used to compute 
mean lengths. For BS(1,2) and BS(1,3) we were only able to obtain series of 
length 60 and 56 respectively due to memory constraints. Given longer series we 
expect much better agreement. 

[Figure 3, panel (c): BS(1,3) sampled with $\alpha = 2$.] 

Figure 3. Mean length of freely reduced trivial words in 
Baumslag-Solitar groups BS(1,1), BS(1,2) and BS(1,3) at different 
values of $\beta$; see equation (4.10) with $k = 1$ and $\alpha$ as indicated. 
The sampled points are indicated with impulses, while the exact 
values are given by the black line. There is excellent agreement 
in the case of BS(1,1), but a systematic error for BS(1,2) and 
BS(1,3) coming from the modest length of the exact series. The 
dotted lines in these two cases indicate mean length data generated 
from longer but approximate series. 

One can, for example, compute longer "approximate" cogrowth series by ignoring 
small terms. When iterating the functional equations given in Proposition 3.1 one 
can form reasonable approximations by discarding coefficients $g_{n,k}$ which are small 
compared to nearby coefficients.⁶ More precisely we found that if we discard $g_{n,k}$ 
when $2^{12} \cdot g_{n,k} < \sum_k g_{n,k}$, then we obtain good approximations of the cogrowth 
series. This means that only the large central coefficients are kept and far less 
memory is used. This made it feasible to approximate the cogrowth series out to 
around 200 or 300 terms. Of course, the results of these approximations should only 
be considered a rough guide as we have not bounded the size of any resulting errors. 
That being said, we see very good agreement between these approximations and 
our numerical data. 

6 Rather than iterating the equations for $G(z;q)$ and then transforming the result to get an 
approximate cogrowth series, we found that our approximation procedure worked best if we 
iterated the slightly more complicated equations for the cogrowth series directly; see the text 
following Proposition 3.1 for a description of those equations. 
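The discard rule can be written in a few lines. In this sketch a row of coefficients $g_{n,k}$ at fixed $n$ is stored as a dictionary from $k$ to an integer, and the threshold is read as $2^{12} \cdot g_{n,k} < \sum_k g_{n,k}$; the function name and the `bits` parameter are assumptions introduced for illustration.

```python
def trim_coefficients(row, bits=12):
    """Drop g_{n,k} whenever 2^bits * g_{n,k} < sum over k of g_{n,k}.

    `row` maps the a-exponent k to the non-negative integer g_{n,k}.
    """
    total = sum(row.values())
    return {k: g for k, g in row.items() if (g << bits) >= total}
```

Applied after each step of the iteration, this keeps only the large central coefficients, at the cost of an error that is not controlled.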

As noted above, we had great difficulty fitting the series data for BS(1,2) and 
BS(1,3). We believe that this is due to the presence of complicated confluent 
corrections (likely logarithmic terms). Similar corrections also appear to be present 
in the mean-length data for these groups and we were unable to find convincing or 
consistent fits to any reasonable functional forms. We did, however, find that the 
estimated standard error was a good indicator of the location of the singularity: 
the standard error will diverge as $\beta$ approaches its critical value. We found 
that linear or quadratic least squares fits of the reciprocal of the error, and finding 
their x-intercept, gave consistent, though perhaps slightly low, estimates of the 
location of the singularity. See Figure 4. The extrapolations give estimates $\beta = 
0.330 \pm 0.002$, $0.332 \pm 0.002$ and $\beta = 0.332 \pm 0.002$ for BS(1,1), BS(1,2) and 
BS(1,3) respectively. 
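The linear version of this extrapolation is an ordinary least-squares line through the points ($\beta$, 1/error) followed by solving for the point where the line crosses zero. The sketch below shows that step only, with a hypothetical function name; the data in any usage would be the measured standard errors, which are not reproduced here.

```python
def x_intercept_linear_fit(xs, ys):
    """Fit y = m x + c by least squares and return the x-intercept -c/m."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = my - m * mx
    return -c / m
```

With `xs` the sampled $\beta$ values and `ys` the reciprocals of the estimated standard errors, the returned value is the estimate of the critical $\beta$.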

Error bars above were determined by estimating a systematic error in our data. 
The systematic error was determined by considering the spread of estimates due 
to our choices of the parameter a, the number of data points in the fits, and the 
chosen functional form for extrapolating the data. We believe that our results give 
a good indication of quality of the estimates, though we are reluctant to express 
them as firm confidence intervals. The same general approach to the data for the 
other groups is followed below. 

The HNN-extension of the Basilica group was similarly submitted to Monte 
Carlo simulation by using the representations (4.6) and (4.7). The canonical 
expected length of the words, $\langle |w| \rangle$, was computed using the ratio estimator (2.22), 
and turned out to be remarkably insensitive to the parameter $\beta$ (see Figure 5). This 
made this group more challenging from a numerical perspective than the 
Baumslag-Solitar groups discussed above. Putting $\alpha = 5$ finally gave acceptable results: the 
sample average length shows a divergence close to the critical point (since this group 
is known to be amenable, this is expected to be at $\beta = 0.2$). As in the case of the 
Baumslag-Solitar groups, the critical value of $\beta$ was determined by extrapolating 
the reciprocal of the error. Extrapolating the curve corresponding to 
representation (4.6) gave $\beta_c = 0.217$ and for representation (4.7), $\beta_c = 0.204$. Taking the 
average and using the absolute difference as a confidence interval gives the estimate 
$\beta_c = 0.21 \pm 0.01$ to two digits accuracy. 

4.3. Non-amenable groups. The groups BS(N,M) with (N,M) = (2,2), (2,3), 
(3,3), (3,5) and the groups $K_1, K_2, K_3$ contain a non-abelian free subgroup and so 
are non-amenable. In the case of the groups $K_1$ and $K_2$ the free subgroups are 
$F((ab), (ab^{-1}))$, and for $K_3$ the free subgroup is $F((ab), (ac))$. 

As noted above, the cogrowth series is known exactly for Kuksov's examples 
and BS(2,2), BS(3,3), so we were able to compute the mean length curves exactly; 
see Figures 6 and 8. As above, we have estimated the location of the dominant 
singularities for all of these groups; see Figures 7 and 9. 

Unfortunately we have been unable to solve 55(2, 3) and 55(3, 5), but we used 
the recurrences of the previous section to compute the first 100 and 120 terms (re- 
spectively) of their cogrowth series. And as was the case for BS(1, 2) and -85(1, 3) 
we also computed an approximation of the cogrowth series using the same method 

Figure 4. Plots of the reciprocal of the estimated standard error in
the mean length vs $\beta$ for (a) BS(1, 1), (b) BS(1, 2) and (c) BS(1, 3),
each sampled with $\alpha = 0, 1, 2, 3$ (anti-clockwise from the top). We
expect that as $\beta$ approaches its critical value, the standard error
will diverge. We see that if we extrapolate the curves then they cross
the x-axis at $\beta = 0.330 \pm 0.002$, $0.332 \pm 0.002$ and $0.332 \pm 0.002$
respectively; thus these extrapolations give good estimates of the
critical value of $\beta$.



described above. These are plotted against our Monte Carlo data in Figures 10
and 11.

In all cases we see strong agreement between our numerical estimates and the 
mean length curves computed from series or exact expressions. As was the case 
with the amenable groups above, fitting the reciprocal of the estimated standard 
error gives quite acceptable estimates of the location of the dominant singularities 
and so the cogrowth. 
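To make the comparison concrete, the following minimal sketch computes a mean length curve from a truncated series. It assumes that a word of length $n$ carries weight $(n+1)^{1+\alpha} \beta^n$ under the stretched Boltzmann distribution; that weight form, the function name and the synthetic series are assumptions of this sketch rather than details taken from this section.

```python
def mean_length(coeffs, beta, alpha=0.0):
    """Mean word length under weights (n + 1)**(1 + alpha) * c_n * beta**n.

    coeffs[n] is the number of trivial words of length n (a truncated series).
    The weight form is an assumption of this sketch.
    """
    weights = [(n + 1) ** (1 + alpha) * c * beta ** n
               for n, c in enumerate(coeffs)]
    total = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / total


# Synthetic series c_n = 2**n (illustration only); its generating function
# is singular at beta = 1/2, so the mean length grows as beta approaches 1/2.
toy = [2 ** n for n in range(200)]
low = mean_length(toy, 0.30)
high = mean_length(toy, 0.45)
```

Because the series is truncated, such a curve is only reliable while the neglected tail is small, which is why series-based curves drift from the Monte Carlo data at larger $\beta$.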



4.4. Thompson's group. Finally we come to Thompson's group, for which we
examine three different presentations as described above. Repeating the same
analysis we used on the previous groups, we see no evidence of a singularity in the
mean length at the amenable values of $\beta$ (see Figures 12 and 13). Indeed our estimates

Figure 5. Numerical data on the HNN-extension of the Basilica
group: (a) mean length with $\alpha = 5$; (b) $\mathrm{err}^{-1}$ with $\alpha = 5$. Data
points indicated by □ correspond to the representation in equation
(4.6) and by × to the representation in equation (4.7). In both
simulations $\alpha = 5$. On the left is a plot of the canonical expected
length $\langle n \rangle$. These expected lengths are only weakly dependent on
$\beta$. On the right is the reciprocal error bar on our data. This
demonstrates that the error diverges as $\beta \to 0.20$, consistent with
the fact that this group is amenable.



of the location of the dominant singularities are

(4.12) $\beta_c = 0.395 \pm 0.005$, $0.172 \pm 0.002$ and $0.134 \pm 0.004$ respectively.

These give cogrowths of $2.53 \pm 0.03$, $5.81 \pm 0.07$ and $7.4 \pm 0.2$, all of which are
well below the amenable values of 3, 7 and 9. We take this to be very strong evidence
that Thompson's group F is not amenable.
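The conversion from the estimates in (4.12) to cogrowths can be checked with a short sketch. It assumes first-order error propagation, $\delta(1/\beta_c) \approx \delta\beta_c / \beta_c^2$, which is a standard choice but is not stated explicitly in the text; the function name is illustrative.

```python
def cogrowth_from_beta_c(beta_c, err):
    """Return (1 / beta_c, first-order propagated error err / beta_c**2)."""
    return 1.0 / beta_c, err / beta_c ** 2


# Estimates (4.12) for the 2-, 4- and 5-generator presentations of F.
estimates = [(0.395, 0.005, 2), (0.172, 0.002, 4), (0.134, 0.004, 5)]

results = []
for beta_c, err, k in estimates:
    mu, mu_err = cogrowth_from_beta_c(beta_c, err)
    amenable = 2 * k - 1  # cogrowth of an amenable group on k generators
    results.append((k, mu, mu_err, amenable))
    print(f"k={k}: cogrowth {mu:.2f} +/- {mu_err:.2f}, amenable value {amenable}")
```

Under this assumption the propagated errors agree with the quoted uncertainties up to rounding, and each estimate plus its error stays below the corresponding amenable value $2k - 1$.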



5. Conclusions 

We have introduced a Markov chain on freely reduced trivial words of any given 
finitely presented group. The transitions along the chain are defined in terms of 
conjugations by generators and insertions of relations. These moves are irreducible 
and satisfy a detailed balance condition; the limiting distribution of the chain is 
therefore a stretched Boltzmann distribution over trivial words. 

In order to validate the algorithm we have implemented it for a range of finitely 
presented groups for which the cogrowth series is known exactly. We have also 
added to this set of groups by finding recurrences for the cogrowth series of all 
Baumslag-Solitar groups. Unfortunately, these recurrences do not have simple 
closed-form solutions, but can be iterated to obtain far longer series than can be 
found using brute-force methods. In the case of BS(N, N), the recurrences simplify 
significantly and we are able to compute the cogrowth exactly. For N = 1, ... ,10 
we have found differential equations satisfied by the cogrowth series which can be 
used to generate the cogrowth series in polynomial time. 
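For the simplest case, BS(1, 1), the cogrowth series can be generated directly, which gives a small check on any implementation. The sketch below is not the recurrence derived in this paper: it uses the fact that BS(1, 1) is the free abelian group on two generators, so a word is trivial exactly when both exponent sums vanish, and it counts freely reduced words by dynamic programming. Upper-case letters denote inverses; all names are illustrative.

```python
from collections import defaultdict

STEPS = {"a": (1, 0), "A": (-1, 0), "b": (0, 1), "B": (0, -1)}
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def cogrowth_series_z2(max_len):
    """Count freely reduced words of each length that are trivial in Z^2."""
    # state: (last letter, x, y) -> number of freely reduced words reaching it
    states = {(None, 0, 0): 1}
    series = [1]  # the empty word
    for _ in range(max_len):
        nxt = defaultdict(int)
        for (last, x, y), count in states.items():
            for letter, (dx, dy) in STEPS.items():
                if last is not None and letter == INVERSE[last]:
                    continue  # appending this letter would allow a cancellation
                nxt[(letter, x + dx, y + dy)] += count
        states = nxt
        series.append(
            sum(c for (_, x, y), c in states.items() if x == 0 and y == 0)
        )
    return series


series = cogrowth_series_z2(8)
```

The first non-trivial term is at length 4, where the eight words are the commutator of the two generators together with its cyclic shifts and their inverses.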

We see excellent agreement between our mean-length estimates and those
computed exactly for several groups. As a further check on our simulations, two of
the authors independently coded the algorithm and compared the results. We can use
our data to estimate the location of the singularity in the generating function of

Figure 6. Mean length of freely reduced trivial words of the
indicated groups, plotted against $\beta$: (a) $\langle a, b \mid a^2, b^3 \rangle$ sampled with
$\alpha = 0$; (b) $\langle a, b \mid a^3, b^3 \rangle$ sampled with $\alpha = 0$;
(c) $\langle a, b, c \mid a^2, b^2, c^2 \rangle$ sampled with $\alpha = 1$. The sampled points
are indicated with impulses, while the exact values are given by the
black line. We have used vertical lines to indicate $\beta = 1/3, 1/5$
(respectively) and also the reciprocal of the cogrowth where the
statistic will diverge, being 0.3418821478, 0.3664068598 and
0.2192752634 respectively. There is excellent agreement between the
numerical and exact results, except possibly at the very highest
$\beta$ values.



freely reduced trivial words. The location of this singularity is the reciprocal of 
the cogrowth and so turns out to be an excellent way to predict the amenability of 
groups. To test this, we used our algorithm on a range of different amenable and 
non-amenable groups. In each case we found that our numerical estimate of the 
cogrowth was completely consistent with the known properties of the groups. In 
particular, where cogrowth is known exactly, our numerics agreed. For each
non-amenable group, the numerical "signal" was robust: no evidence of a singularity
was seen at the amenable value.

Most importantly, we see absolutely no evidence that the mean length of
Thompson's group is divergent close to the amenable value; i.e. for 2, 4 and 5
generator presentations we see no evidence of a singularity at $\beta = 1/3$, $1/7$ or
$1/9$ (respectively). Indeed, in each case, the mean length appears to be very smooth
for $\beta$-values some reasonable distance above these points. Varying $\alpha$ or
examining other statistics

Figure 7. The reciprocal of the estimated standard error vs $\beta$ for
(a) $\langle a, b \mid a^2, b^3 \rangle$, (b) $\langle a, b \mid a^3, b^3 \rangle$ and
(c) $\langle a, b, c \mid a^2, b^2, c^2 \rangle$, each sampled with $\alpha = 0, 1, 2, 3$
(clockwise from top in each case). We see that the extrapolations of
the curves intersect the x-axis very close to, but slightly short of,
the indicated critical values of $\beta$: 0.3418821478, 0.3664068598 and
0.2192752634 respectively. Hence these give good, but slightly low,
estimates of the location of the singularities,
$\beta = 0.340 \pm 0.002$, $0.365 \pm 0.002$ and $0.219 \pm 0.001$.



does not result in any substantial change, with the result that values of $\beta$
consistent with amenability are excluded from our estimated error bars. Overall, our
numerical data lead us to the conclusion that Thompson's group F is not amenable
with high probability.

To further test this hypothesis we also examined a generalisation of Thompson's
group, namely F(3) (see [7]). We used our algorithm to compute the mean length
of freely reduced words in two presentations of F(3), namely

(5.1) $\langle a, b, c, d, e \mid d = b^a,\ e = c^b = c^a,\ d^c = d^b = d^a,\ e^d = e^c = e^b = e^a \rangle$

(5.2) $\langle a, b, c \mid c^b = c^a,\ (b^a)^c = (b^a)^b = (b^a)^a,\ (c^b)^{(b^a)} = (c^b)^c = (c^b)^b = (c^b)^a \rangle$,

where $x^y = y^{-1} x y$. (As was the case above, the generators a, b, c, d, e are more
usually written $x_0, \ldots, x_4$.) Note that the first presentation can be written as
3 relations of length 4 and 5 of length 6, while the second can be written as
1 relation of length

Figure 8. Mean length of freely reduced trivial words in
Baumslag-Solitar groups (a) BS(2, 2) and (b) BS(3, 3), each sampled
with $\alpha = 1$, at different values of $\beta$. The sampled points are
indicated with impulses, while the exact values are given by the black
line. We have used vertical lines to indicate $\beta = 1/3$ and also the
reciprocal of the cogrowth where the statistic will diverge, being
0.3747331572 and 0.417525628 respectively. We see excellent agreement
between our numerical data and the exact results, and our error bars
are smaller than the impulses.

Figure 9. The reciprocal of the estimated standard error of the
mean length as a function of $\beta$ for (a) BS(2, 2) and (b) BS(3, 3). In
both plots we show 4 curves corresponding to simulations at
$\alpha = 0, 1, 2, 3$ (anti-clockwise from top) and denote the singular
values, 0.3747331572 and 0.417525628 respectively, with vertical lines.
Extrapolating the curves gives estimates of $\beta_c = 0.372 \pm 0.002$ and
$0.416 \pm 0.001$ respectively.



6, 4 of length 10 and 1 of length 14. As was the case for Thompson's group F, we
found no evidence of singularities at $\beta = 1/9$, $1/5$ respectively.

Figure 10. The mean length of trivial words in (a) BS(2, 3) sampled
with $\alpha = 1$ and (b) BS(3, 5) sampled with $\alpha = 0$, at different
values of $\beta$. We see very good agreement for low and moderate values
of $\beta$ and systematic errors for larger $\beta$. Again, this error arises
from the modest length of the exact series. The dotted curves in these
two cases indicate mean length data generated from longer but
approximate series, while the dotted vertical lines indicate the
estimated critical value of $\beta$ from series.

Figure 11. The reciprocal of the estimated standard error of the
mean length as a function of $\beta$ for (a) BS(2, 3) and (b) BS(3, 5). In
both plots we show 4 curves corresponding to simulations at
$\alpha = 0, 1, 2, 3$ (anti-clockwise from top). Extrapolating these curves
we estimate $\beta_c = 0.388 \pm 0.02$ and $0.444 \pm 0.002$ for BS(2, 3) and
BS(3, 5) respectively. These are quite close to the estimates from
series of 0.393 and 0.443 (indicated with vertical lines).



As an additional note, we have applied our methods to a finitely generated,
but not finitely presented, group, namely the lamplighter group. In this case the
algorithm has to be modified slightly. One can no longer choose relations uniformly
at random, but instead we choose them from a distribution P(R) over the relations.
As noted in section 2, this distribution must be positive and one must have

Figure 12. Mean length of freely reduced trivial words in
Thompson's group F at different values of $\beta$; panel (c) shows
presentation (4.3) for F sampled with $\alpha = 1$. The solid blue lines
indicate the reciprocal of the cogrowth of amenable groups with k
generators, $\beta_c = 1/(2k-1)$. The dashed blue lines indicate the
approximate location of the vertical asymptote. In each case, we see
that the mean length of trivial words is finite for $\beta$-values at and
slightly above $\beta_c$. This is strong evidence that Thompson's group is
not amenable.



$P(R) = P(R^{-1})$. With these conditions the algorithm remains ergodic on the
space of trivial words and the stationary distribution is still a stretched Boltzmann
distribution. This leaves a great deal of freedom in choosing P, and our experiments
indicated that our results were quite independent of P and are consistent with the
amenability of the group.
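One way to build a distribution satisfying both conditions is sketched below. It assumes the standard infinite presentation of the lamplighter group with generators a, t and relations $a^2$ and $[a, t^k a t^{-k}]$ for $k \ge 1$, and it assumes a geometric distribution over the index k; neither the presentation nor the choice of P used in our experiments is specified in this section, so both are hypothetical here. Upper-case letters denote inverses.

```python
import random

INVERSE = {"a": "A", "A": "a", "t": "T", "T": "t"}


def inverse(word):
    """Formal inverse of a word: reverse it and invert each letter."""
    return "".join(INVERSE[ch] for ch in reversed(word))


def relation(k):
    """k = 0 gives a^2; k >= 1 gives the commutator [a, t^k a t^-k]."""
    if k == 0:
        return "aa"
    conj = "t" * k + "a" + "T" * k
    # [x, y] = x^-1 y^-1 x y with x = a and y = t^k a t^-k
    return inverse("a") + inverse(conj) + "a" + conj


def sample_relation(rng, q=0.5):
    """Draw k geometrically, then return the relation or its inverse.

    Every relation has positive probability, and a relation and its
    inverse are equally likely, so P(R) = P(R^-1) by construction.
    """
    k = 0
    while rng.random() < q:
        k += 1
    word = relation(k)
    return word if rng.random() < 0.5 else inverse(word)


rng = random.Random(1)
word = sample_relation(rng)
```

The parameter q (hypothetical) controls how often long relations are proposed; any value strictly between 0 and 1 keeps the distribution positive on every relation.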

Figure 13. The reciprocal of the estimated standard error of
the mean length as a function of $\beta$ for the three presentations
of Thompson's group; panel (c) shows presentation (4.3) for F sampled
with $\alpha = 0, 1, 2, 3$. In each plot we show 4 curves corresponding to
simulations at $\alpha = 0, 1, 2, 3$ (anti-clockwise from top).
Extrapolating these curves leads to estimates of $\beta_c$ of
$0.395 \pm 0.005$, $0.172 \pm 0.002$ and $0.134 \pm 0.004$. These are all well
above the values for amenable groups and so we take this to be strong
evidence that Thompson's group is not amenable.